I'm trying to split the string into an array of words and check them against the unix /usr/dict/words dictionary. If a match is found for the word it gets lcfirst($word)
if no match then ucfirst( $word )
辞書はfgetcsvを使って開かれ、配列に入れられます(私はfgetsを使い、行末に爆発的に試みました)。
function wnd_title_case( $string ) {
$file = fopen( "/users/chris/sites/wp-dev/trunk/core/words.txt", "rb" );
while ( !feof( $file ) ) {
$line_of_text = fgetcsv( $file );
$exceptions = array( $line_of_text );
}
fclose( $file );
$delimiters = array(" ", "-", "O'");
foreach ( $delimiters as $delimiter ) {
$words = explode( $delimiter, $string );
$newwords = array();
foreach ($words as $word) {
if ( in_array( strtoupper( $word ), $exceptions ) ) {
//check exceptions list for any words that should be lower case
$word = lcfirst( $word );
} elseif ( !in_array( $word, $exceptions ) ) {
//everything else capitalized
$word = ucfirst( $word );
}
array_push( $newwords, $word );
}
$string = join( $delimiter, $newwords );
}
$string = ucfirst( $string );
return $string;
}
ファイルが開かれたことを確認しました。
The desired output: Sentence case title string with proper nouns capitalized.
The current output: Title string with every word capitalized
以下のJayの答えを使用して、私は実行可能なソリューションを考え出しました。私の最初の問題は、私の単語辞書には、大文字と小文字の両方の単語が含まれていたので、正規表現コールバックの使用をチェックするための適切な名前辞書が見つかりました。それは完璧ではありませんが、ほとんどの場合それを正しく取得します。
function title_case( $string ) {
$fp = @fopen( THEME_DIR. "/_/inc/propernames", "r" );
$exceptions = array();
if ( $fp ) {
while( !feof($fp) ) {
$buffer = fgets( $fp );
array_push( $exceptions, trim($buffer) );
}
}
fclose( $fp );
$content = strtolower( $string );
$pattern = '~\b' . implode ( '|', $exceptions ) . '\b~i';
$content = preg_replace_callback ( $pattern, 'regex_callback', $content );
$new_content = $content;
return ucfirst( $new_content );
}
function regex_callback ( $data ) {
if ( strlen( $data[0] ) > 3 )
return ucfirst( strtolower( $data[0] ));
else return ( $data[0] );
}
正規表現でこれを行う最も簡単な方法は、以下を行うことです
$content = ucwords($original_content);
|
, and surrounding it with boundary markers and delimiters followed by the case insensitive flag, so you would end up with ~\bword1|word2|word3\b~i
(obviously with your large list)作業デモの例は次のとおりです。
function regex_callback($data) {
return strtolower($data[0]);
}
$original_content = 'hello my name is jay gilford';
$words = array('hello', 'my', 'name', 'is');
$content = ucwords($original_content);
$pattern = '~\b' . implode('|', $words) . '\b~i';
$content = preg_replace_callback($pattern, 'regex_callback', $content);
echo $content;
You could also optionally use strtolower to begin with on the content for consistency. The above code outputs hello my name is Jay Gilford