PHP
Last updated: Fri, 20 Jun 2008

str_word_count

(PHP 4 >= 4.3.0, PHP 5)

str_word_count — Counts the number of words used in a string

Description

mixed str_word_count ( string $string [, int $format [, string $charlist ]] )

str_word_count() counts the number of words in string. If the optional format parameter is not specified, the return value is an integer representing the number of words found. If format is specified, the return value is an array whose content depends on format. The possible values for format are listed below.

For the purpose of this function, the notion of a "word" depends on the locale settings. A word is a string of alphabetic characters which may also contain, but not start with, the "'" and "-" characters.
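As a small illustration of the definition above (a sketch only; the sample sentence is made up for this example), apostrophes and hyphens that sit inside a word do not split it:

```php
<?php
// Hypothetical sample text: one apostrophe and one hyphen inside words.
$text = "It's a well-known fact";

// Format 1 returns the list of words found.
$words = str_word_count($text, 1);
print_r($words);

// With no format argument, only the count is returned.
$count = str_word_count($text);
echo $count, "\n"; // prints 4
```

Only plain ASCII letters are used here, so the result should not depend on the current locale.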

Parameters

string

The input string

format

Specifies the return value of this function. The currently supported values are:

  • 0 - returns the number of words found
  • 1 - returns an array containing all the words found inside string
  • 2 - returns an associative array, where the key is the numeric position of the word inside string and the value is the word itself

charlist

A list of additional characters which will be considered part of a 'word'

Return Values

Returns an array or an integer, depending on the format chosen.

Changelog

Version Description
5.1.0 Added the charlist parameter

Examples

Example #1 A str_word_count() example

<?php

$str = "Salut l'ami, vous
        avez          une b3lle mine !";

print_r(str_word_count($str, 1));
print_r(str_word_count($str, 2));
print_r(str_word_count($str, 1, 'àáãç3'));

echo str_word_count($str);

?>

The above example will output:


Array
(
    [0] => Salut
    [1] => l'ami
    [2] => vous
    [3] => avez
    [4] => une
    [5] => b
    [6] => lle
    [7] => mine
)

Array
(
    [0] => Salut
    [6] => l'ami
    [13] => vous
    [26] => avez
    [40] => une
    [44] => b
    [46] => lle
    [50] => mine
)

Array
(
    [0] => Salut
    [1] => l'ami
    [2] => vous
    [3] => avez
    [4] => une
    [5] => b3lle
    [6] => mine
)

8



User Contributed Notes
str_word_count
aspu.ru
17-Jun-2008 01:21
str_word_count: mixed (string string, [int format], [string charlist])

It can help you solve problems with digits and some locales. Best regards.
robocop at robotix dot fr
30-Mar-2008 04:21
function count_words($texte)
{
$texte=trim($texte);
$motsinutiles = array(' * ', ' - ', ' : ', '\n');
$texte = str_replace($motsinutiles, '', $texte);
$texte = preg_replace("/\s\s+/", " ", $texte);
$decoupeapostrophes = count(explode('\'', $texte)); // Split the string on apostrophes
   if($decoupeapostrophes==0) $nombreapostrophes = 0;
   if ($decoupeapostrophes%2==0) {$nombreapostrophes = $decoupeapostrophes/2;}
   else  $nombreapostrophes = ($decoupeapostrophes/2)-0.5;
$nombreespace = count(explode(' ', $texte));

return $nombreespace+$nombreapostrophes;   
}
security_man
24-Dec-2007 10:13
there was a glitch in the code cathy put a post or two ago... should be:

    function limit_text($text, $limit) {
    $text = strip_tags($text);
      $words = str_word_count($text, 2);
      $pos = array_keys($words);
      if (count($words) > $limit) {
          $text = substr($text, 0, $pos[$limit]) . ' ...';
      }
    return $text;
    }

I also added the strip tags in case there is html in there to gum up the works
Adeel Khan
09-Dec-2007 05:01
<?php

/**
 * Returns the number of words in a string.
 * As far as I have tested, it is very accurate.
 * The string can have HTML in it,
 * but you should do something like this first:
 *
 *    $search = array(
 *      '@<script[^>]*?>.*?</script>@si',
 *      '@<style[^>]*?>.*?</style>@siU',
 *      '@<![\s\S]*?--[ \t\n\r]*>@'
 *    );
 *    $html = preg_replace($search, '', $html);
 *
 */

function word_count($html) {

    # strip all html tags
    $wc = strip_tags($html);

    # remove 'words' that don't consist of alphanumerical characters or punctuation
    $pattern = "#[^(\w|\d|\'|\"|\.|\!|\?|;|,|\\|\/|\-|:|\&|@)]+#";
    $wc = trim(preg_replace($pattern, " ", $wc));

    # remove one-letter 'words' that consist only of punctuation
    $wc = trim(preg_replace("#\s*[(\'|\"|\.|\!|\?|;|,|\\|\/|\-|:|\&|@)]\s*#", " ", $wc));

    # remove superfluous whitespace
    $wc = preg_replace("/\s\s+/", " ", $wc);

    # split string into an array of words
    $wc = explode(" ", $wc);

    # remove empty elements
    $wc = array_filter($wc);

    # return the number of words
    return count($wc);

}

?>
Cathy
19-Jul-2007 10:16
A cute little function for truncating text to a given word limit:
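<?php
function limit_text($text, $limit) {
    // Note: strlen() counts characters, not words, so $pos[$limit] may be
    // undefined for short texts; the follow-up note above corrects this.
    if (strlen($text) > $limit) {
        $words = str_word_count($text, 2);
        $pos = array_keys($words);
        $text = substr($text, 0, $pos[$limit]) . '...';
    }
    return $text;
}
?>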
geertdd at gmail dot com
13-Jun-2007 06:04
This is an update to my previously posted word_limiter() function. The regex is even more optimized now. Just replace the preg_match line. Change to:

<?php
preg_match('/^\s*(?:\S+\s*){1,'. (int) $limit .'}/', $str, $matches);
?>
geertdd at gmail dot com
28-May-2007 01:52
Here's a very fast word limiter function that preserves the original whitespace.

<?php

function word_limiter($str, $limit = 100, $end_char = '&#8230;') {

    if (trim($str) == '')
        return $str;

    preg_match('/\s*(?:\S*\s*){'. (int) $limit .'}/', $str, $matches);

    if (strlen($matches[0]) == strlen($str))
        $end_char = '';

    return rtrim($matches[0]) . $end_char;
}

?>

For the thought process behind this function, please read: http://codeigniter.com/forums/viewthread/51788/

Geert De Deckere
joshua dot blake at gmail dot com
03-Mar-2007 02:02
I needed a function which would extract the first hundred words out of a given input while retaining all markup such as line breaks, double spaces and the like. Most of the regexp based functions posted above were accurate in that they counted out a hundred words, but recombined the paragraph by imploding an array down to a string. This did away with any such hopes of line breaks, and thus I devised a crude but very accurate function which does all that I ask it to:

function Truncate($input, $numWords)
{
  if(str_word_count($input,0)>$numWords)
  {
    $WordKey = str_word_count($input,1);
    $PosKey = str_word_count($input,2);
    reset($PosKey);
    foreach($WordKey as $key => &$value)
    {
        $value=key($PosKey);
        next($PosKey);
    }
    return substr($input,0,$WordKey[$numWords]);
  }
  else {return $input;}
}

The idea behind it? Go through the keys of the arrays returned by str_word_count and associate the number of each word with its character position in the phrase. Then use substr to return everything up until the nth character. I have tested this function on rather large entries and it seems to be efficient enough that it does not bog down at all.
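The same idea can be sketched more compactly by taking the keys of the format-2 array directly. This is only an illustration; the name truncate_words() is hypothetical and not part of the original note:

```php
<?php
// Hypothetical helper illustrating the offset-based truncation idea.
function truncate_words($input, $numWords)
{
    // The keys of the format-2 array are the offsets where each word starts.
    $offsets = array_keys(str_word_count($input, 2));

    // Nothing to cut if the text has no word beyond the limit.
    if (count($offsets) <= $numWords) {
        return $input;
    }

    // Cut just before the first word we do not want, then drop the
    // whitespace that separated it from the last kept word.
    return rtrim(substr($input, 0, $offsets[$numWords]));
}

$short = truncate_words("one two\n  three four five", 3);
echo $short, "\n";
```

Because the cut is made with substr() on the original string, line breaks and repeated spaces inside the kept part are left untouched.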

Cheers!

Josh
josh at joshblake.net
02-Mar-2007 12:57
I was interested in a function which returned the first few words out of a larger string.

In reality, I wanted a preview of the first hundred words of a blog entry which was well over that.

I found all of the other functions which explode and implode strings to arrays lost key markups such as line breaks etc.

So, this is what I came up with:

function WordTruncate($input, $numWords) {
if(str_word_count($input,0)>$numWords)
{
    $WordKey = str_word_count($input,1);
    $WordIndex = array_flip(str_word_count($input,2));
    return substr($input,0,$WordIndex[$WordKey[$numWords]]);
}
else {return $input;}
}

While I haven't counted per se, it's accurate enough for my needs. It will also return the entire string if it's less than the specified number of words.

The idea behind it? Use str_word_count to identify the nth word, then use str_word_count to identify the position of that word within the string, then use substr to extract up to that position.

Josh.
30-Jan-2007 05:15
Here is a PHP word counting function together with a JavaScript version which will print the same result.

<?php
// PHP word counting function.
// preg_replace() is used here so that the patterns match the JavaScript version.
function word_count($theString)
{
    $fullStr = $theString . " ";

    // Strip any leading non-alphanumeric characters
    $left_trimmedStr = preg_replace('/^[^A-Za-z0-9]+/', '', $fullStr);

    // Collapse every run of non-alphanumeric characters into a single space
    $cleanedStr = preg_replace('/[^A-Za-z0-9]+/', ' ', $left_trimmedStr);

    $splitString = explode(" ", $cleanedStr);

    // The trailing space always yields one extra empty element
    $word_count = count($splitString) - 1;

    if (strlen($fullStr) < 2) {
        $word_count = 0;
    }
    return $word_count;
}
?>

// JavaScript function to count words in a phrase
function wordCount(theString)
{
    var fullStr = theString + " ";
    var initial_whitespace_rExp = /^[^A-Za-z0-9]+/gi;
    var left_trimmedStr = fullStr.replace(initial_whitespace_rExp, "");
    var non_alphanumerics_rExp = /[^A-Za-z0-9]+/gi;
    var cleanedStr = left_trimmedStr.replace(non_alphanumerics_rExp, " ");
    var splitString = cleanedStr.split(" ");

    var word_count = splitString.length - 1;

    if (fullStr.length < 2) {
        word_count = 0;
    }
    return word_count;
}
Aurelien Marchand
06-Oct-2006 06:06
I found a more reliable way to print, say, the first 100 words and then print an ellipsis. My code goes this way:

$threshold_length = 80; // 80 words max
$phrase = "...."; // populate this with the text you want to display
$abody = str_word_count($phrase,2);
if(count($abody) > $threshold_length){ // gotta cut
  $tbody = array_keys($abody);
  echo "<p>" . substr($phrase,0,$tbody[$threshold_length]) . "... <span class=\"more\"><a href=\"?\">read more</a></span> </p>\n";
} else { // put the whole thing
  echo "<p>" . $phrase . "</p>\n";
}

For any questions, com.iname@artaxerxes2
lwright at psu dot edu
17-Aug-2006 08:51
If you are looking to count the frequency of words, try:

<?php
$wordfrequency = array_count_values(str_word_count($string, 1));
?>
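A slightly fuller sketch of the same idea follows, assuming you want the count to be case-insensitive and sorted by frequency. The sample text is made up for illustration:

```php
<?php
// Hypothetical sample text, for illustration only.
$string = "The cat and the hat and the bat";

// Lower-case first so that "The" and "the" are counted together.
$wordfrequency = array_count_values(str_word_count(strtolower($string), 1));

// Sort by frequency, highest first, keeping the words as array keys.
arsort($wordfrequency);

print_r($wordfrequency);
```

Note that strtolower() only handles single-byte characters; for multibyte text a different lower-casing step would be needed.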
rabin at rab dot in
05-Apr-2006 08:03
There is a small bug in the "trim_text" function by "webmaster at joshstmarie dot com" below. If the string's word count is less than or equal to $truncation, that function will cut off the last word in the string.

[EDITOR'S NOTE: above referenced note has been removed]

This fixes the problem:

<?php
function trim_text_fixed($string, $truncation = 250) {
    $matches = preg_split("/\s+/", $string, $truncation + 1);
    $sz = count($matches);
    if ($sz > $truncation) {
        unset($matches[$sz-1]);
        return implode(' ', $matches);
    }
    return $string;
}
?>
webmaster at joshstmarie dot com
26-Sep-2005 01:58
Trying to make an efficient word splitter and "paragraph limiter", e.g. limit item text to 100 or 200 words and so forth.

I don't know how well this compares, but it works nicely.

function trim_text($string, $word_count=100)
{
    $trimmed = "";
    $string = preg_replace("/\040+/"," ", trim($string));
    $stringc = explode(" ",$string);
    if($word_count >= sizeof($stringc))
    {
        // nothing to do, our string is smaller than the limit.
      return $string;
    }
    elseif($word_count < sizeof($stringc))
    {
        // trim the string to the word count
        for($i=0;$i<$word_count;$i++)
        {
            $trimmed .= $stringc[$i]." ";
        }
       
        if(substr($trimmed, strlen(trim($trimmed))-1, 1) == '.')
          return trim($trimmed).'..';
        else
          return trim($trimmed).'...';
    }
}

$text = "some  test          text goes in here, I'm not sure, but ok.";
echo trim_text($text,5);
MadCoder
16-Aug-2005 06:12
Here's a function that will trim a $string down to a certain number of words and add "..." to the end of it.
(expansion of muz1's first-100-words code)

----------------------------------------------
function trim_text($text, $count){
    $text = str_replace("  ", " ", $text);
    $string = explode(" ", $text);
    $trimed = "";
    // Append at most $count words, never reading past the end of the array
    for ($wordCounter = 0; $wordCounter < $count && $wordCounter < count($string); $wordCounter++) {
        $trimed .= $string[$wordCounter] . " ";
    }
    return trim($trimed) . "...";
}

Usage
------------------------------------------------
$string = "one two three four";
echo trim_text($string, 3);

returns:
one two three...
jtey at uoguelph dot ca
15-Aug-2005 01:21
In the previous note, the example will only extract from the string, words separated by exactly one space.  To properly extract words from all strings, use regular expressions.

Example (extracting the first 4 words):
<?php
$string = "One    two three       four  five six";
echo implode(" ", array_slice(preg_split("/\s+/", $string), 0, 4));
?>

The above $string would not have otherwise worked when using the explode() method below.
jtey at uoguelph dot ca
14-Aug-2005 04:59
In reply to muz1's post below:

You can also take advantage of using other built in PHP functions to get to your final result.  Consider the following:
<?php
$string = "One two three four five six seven eight nine ten.";
// the first n words to extract
$n = 3;
// extract the words
$words = explode(" ", $string);
// chop the words array down to the first n elements
$firstN = array_slice($words, 0, $n);
// glue the 3 elements back into a spaced sentence
$firstNAsAString = implode(" ", $firstN);
// display it
echo $firstNAsAString;
?>

Or to do it all in one line:
<?php
echo implode(" ", array_slice(explode(" ", $string), 0, $n));
?>
muz1 at muzcore dot com
12-Aug-2005 09:56
This function is awesome however I needed to display the first 100 words of a string. I am submitting this as a possible solution but also to get feedback as to whether it is the most efficient way of doing it.

<?php
$currString = explode(" ", $string);
for ($wordCounter = 0; $wordCounter < 100; $wordCounter++) {
    echo $currString[$wordCounter] . " ";
}
?>
16-Jan-2005 03:38
This function seems to view numbers as whitespace. I.e. a word consisting of numbers only won't be counted.
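If digits should count as word characters, the charlist parameter (available from PHP 5.1.0, per the changelog above) can list them. A minimal sketch, with a made-up sample string:

```php
<?php
// Hypothetical sample text mixing letters and digits.
$text = "route 66 and b3lle";

// Default behaviour: digits act as separators, so "66" is dropped
// and "b3lle" is split into "b" and "lle".
$default = str_word_count($text, 1);

// Listing the digits in charlist makes them part of a word.
$withDigits = str_word_count($text, 1, '0123456789');

print_r($default);
print_r($withDigits);
```

With the digits listed, the second call should return "route", "66", "and" and "b3lle".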
aix at lux dot ee
14-Nov-2004 11:53
One function.
<?php
if (!function_exists('word_count')) {
    function word_count($str, $n = "0") {
        $m = strlen($str) / 2;
        $a = 1;
        while ($a < $m) {
            $str = str_replace("  ", " ", $str);
            $a++;
        }
        $b = explode(" ", $str);
        $i = 0;
        foreach ($b as $v) {
            $i++;
        }
        if ($n == 1) return $b;
        else return $i;
    }
}
$str = "Tere Tartu linn";
$c = word_count($str, 1); // returns an array
$d = word_count($str);    // returns an int: how many words were in the text
print_r($c);
echo $d;
?>
aidan at php dot net
26-Jun-2004 12:02
This functionality is now implemented in the PEAR package PHP_Compat.

More information about using this function without upgrading your version of PHP can be found on the below link:

http://pear.php.net/package/PHP_Compat
Kirils Solovjovs
22-Feb-2004 06:06
None of this worked for me. I think countwords() is very encoding dependent. This is the code for win1257. For other layouts you just need to redefine the ranges of letters...

<?php
function countwords($text){
    $ls = 0;   // was it a whitespace?
    $cc33 = 0; // counter
    for ($i = 0; $i < strlen($text); $i++) {
        $spstat = false; // is it a number or a letter?
        $ot = ord($text[$i]);
        if ((($ot>=48) && ($ot<=57)) || (($ot>=97) && ($ot<=122)) || (($ot>=65) && ($ot<=90)) || ($ot==170) ||
            (($ot>=192) && ($ot<=214)) || (($ot>=216) && ($ot<=246)) || (($ot>=248) && ($ot<=254))) $spstat = true;
        if (($ls==0) && ($spstat)) {
            $ls = 1;
            $cc33++;
        }
        if (!$spstat) $ls = 0;
    }
    return $cc33;
}

?>
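For UTF-8 text, one alternative to hard-coding byte ranges is a Unicode-aware regular expression. The sketch below assumes the input is valid UTF-8 and that PCRE was built with Unicode support; count_words_utf8() is a hypothetical name used only for this example:

```php
<?php
// Hypothetical helper; assumes $text is valid UTF-8.
function count_words_utf8($text)
{
    // \p{L} matches any letter and \p{N} any digit. A single apostrophe or
    // hyphen may join two such runs. The /u modifier enables UTF-8 mode.
    $pattern = '/[\p{L}\p{N}]+(?:[\'-][\p{L}\p{N}]+)*/u';
    $n = preg_match_all($pattern, $text, $matches);

    // preg_match_all() returns false on error (for example invalid UTF-8).
    return $n === false ? 0 : $n;
}

// "\xC4\xAB" is the UTF-8 encoding of the letter i with macron.
$n = count_words_utf8("R\xC4\xABga ir l'amie");
echo $n, "\n"; // prints 3
```

Unlike the byte-range approach, this does not need to be rewritten per code page, but it only works when the text really is UTF-8.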
Artimis
15-Oct-2003 11:32
Never use this function to count or separate alphanumeric words; it will split them apart wherever letters and digits meet. Use preg_split() instead when splitting alphanumeric words. It works with Chinese characters as well.
andrea at 3site dot it
19-May-2003 01:55
If the string doesn't contain a space " ", the explode method doesn't do anything, so I wrote this and it seems to work better... I don't know about time and resource usage.

<?php
function str_incounter($match, $string) {
    $count_match = 0;
    for ($i = 0; $i < strlen($string); $i++) {
        if (strtolower(substr($string, $i, strlen($match))) == strtolower($match)) {
            $count_match++;
        }
    }
    return $count_match;
}
?>

example

<?php
$string = "something:something!!something";
$count_some = str_incounter("something", $string);
// will return 3
?>
megat at megat dot co dot uk
19-Apr-2003 03:29
[Ed: You'd probably want to use regular expressions if this was the case --alindeman @ php.net]

Consider what will happen in some of the above suggestions when a person puts more than one space between words. That's why it's not sufficient just to explode the string.
olivier at ultragreen dot net
11-Apr-2003 03:10
I will not discuss the accuracy of this function but one of the source codes above does this.

<?php
function wrdcnt($haystack) {
    $cnt = explode(" ", $haystack);
    return count($cnt) - 1;
}
?>

That could be replaced by

<?php
function wrdcnt($haystack) {
    return substr_count($haystack, ' ') + 1;
}
?>

I doubt this does need to be a function :)
philip at cornado dot com
07-Apr-2003 04:30
Some ask why not just split on ' '. Well, it's because simply exploding on a ' ' isn't fully accurate.  Words can be separated by tabs, newlines, double spaces, etc.  This is why people tend to separate on all whitespace with regular expressions.
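To make the difference concrete, here is a small sketch comparing the two approaches on a made-up string that mixes a tab, a double space and a newline:

```php
<?php
// Hypothetical sample text with mixed whitespace.
$text = "One\ttwo  three\nfour";

// explode() on a single space leaves the tab and newline inside tokens
// and produces an empty string for the doubled space.
$naive = explode(' ', $text);

// Splitting on runs of any whitespace gives one token per word.
$words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

echo count($naive), "\n"; // prints 3
echo count($words), "\n"; // prints 4
```

The PREG_SPLIT_NO_EMPTY flag also discards the empty pieces that leading or trailing whitespace would otherwise produce.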
rcATinterfacesDOTfr
16-Jan-2003 04:58
Here is another way to count words :
$word_count = count(preg_split('/\W+/', $text, -1, PREG_SPLIT_NO_EMPTY));
brettNOSPAM at olwm dot NO_SPAM dot com
09-Nov-2002 09:06
This example may not be pretty, but it proves accurate:

<?php
//count words
$words_to_count = strip_tags($body);
$pattern = "/[^(\w|\d|\'|\"|\.|\!|\?|;|,|\\|\/|\-\-|:|\&|@)]+/";
$words_to_count = preg_replace ($pattern, " ", $words_to_count);
$words_to_count = trim($words_to_count);
$total_words = count(explode(" ",$words_to_count));
?>

Hope I didn't miss any punctuation. ;-)
gorgonzola at nospam dot org
31-Oct-2002 11:48
i tried to write a wordcounter and ended up with this:

<?php
//strip html-codes or entities
$text = strip_tags(strtr($text, array_flip(get_html_translation_table(HTML_ENTITIES))));
//count the words
$wordcount = preg_match_all("#(\w+)#", $text, $match_dummy );
?>
