<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Word Usage in SciFi Stories</title>
	<atom:link href="http://www.spacetimestories.com/commentary/word-usage-in-scifi-stories/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.spacetimestories.com/commentary/word-usage-in-scifi-stories/</link>
	<description>Space and Time Travel Stories.  A Science Fiction Blog By Sean O&#039;Brien</description>
	<lastBuildDate>Sun, 02 Oct 2011 15:26:34 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.5</generator>
	<item>
		<title>By: Ellen</title>
		<link>http://www.spacetimestories.com/commentary/word-usage-in-scifi-stories/comment-page-1/#comment-272</link>
		<dc:creator>Ellen</dc:creator>
		<pubDate>Sat, 20 Feb 2010 02:34:17 +0000</pubDate>
		<guid isPermaLink="false">http://www.spacetimestories.com/?p=24#comment-272</guid>
		<description>My UNIX skills are so rusty as to have gaping corroded holes, but thanks to help pages  I successfully used your word counter tonight!  I am immensely grateful. 

I had been counting using BBEdit&#039;s &quot;find all&quot; command, which was helpful but assumed I knew which words I was overusing to begin with and gave me no printed report. I knew there had to be a more efficient way. This is magnificent! Thank you!</description>
		<content:encoded><![CDATA[<p>My UNIX skills are so rusty as to have gaping corroded holes, but thanks to help pages  I successfully used your word counter tonight!  I am immensely grateful. </p>
<p>I had been counting using BBEdit&#8217;s &#8220;find all&#8221; command, which was helpful but assumed I knew which words I was overusing to begin with and gave me no printed report. I knew there had to be a more efficient way. This is magnificent! Thank you!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sean</title>
		<link>http://www.spacetimestories.com/commentary/word-usage-in-scifi-stories/comment-page-1/#comment-182</link>
		<dc:creator>Sean</dc:creator>
		<pubDate>Tue, 21 Apr 2009 02:40:57 +0000</pubDate>
		<guid isPermaLink="false">http://www.spacetimestories.com/?p=24#comment-182</guid>
		<description>Phil,  outstanding contribution.  Thanks very much.</description>
		<content:encoded><![CDATA[<p>Phil,  outstanding contribution.  Thanks very much.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Phil Weaver</title>
		<link>http://www.spacetimestories.com/commentary/word-usage-in-scifi-stories/comment-page-1/#comment-181</link>
		<dc:creator>Phil Weaver</dc:creator>
		<pubDate>Mon, 20 Apr 2009 15:51:39 +0000</pubDate>
		<guid isPermaLink="false">http://www.spacetimestories.com/?p=24#comment-181</guid>
		<description>Thanks for the Perl script, Sean.  I had a similar idea for my own book, and was glad to use it.  I had a wider range of punctuation, though, and decided I wanted to convert everything to lower case before collecting the histogram.  Here&#039;s my modified version if you want it.

Regards,
-Phil

my %word_list; # this hash will have an entry for every distinct word
my $count = 0; # count of distinct words
my $total_words =0; # total # of words used
my $file = shift; # name of text file to parse

open FILE, &quot;&lt;$file&quot;;

while (&lt;FILE&gt;){ # for each line in the file

	my @words = split; # split each line into separate words

	foreach my $word (@words) { # for each word on that line

		$word =~ s/[.]//g;  # strip any periods
		$word =~ s/”//g;  # strip any quotes
		$word =~ s/“//g;  # strip any quotes
		$word =~ s/&quot;//g;  # strip any quotes
		$word =~ s/,//g;  # strip any commas
		$word =~ s/’//g;  # strip any apostrophes
		$word =~ s/://g;  # strip any colons
		$word =~ s/&#039;//g;  # strip any single-quotes
		$word =~ s/;//g;  # strip any semicolons
		$word =~ s/!//g;  # strip any exclamation points
		$word =~ s/\?//g;  # strip any question marks
		
		$word = lc($word);
		# unless ($word =~ /ly/) {next;} # remove the first # sign if you want to look for adverbs

		if ($word_list{$word}) { # if the word has already been seen increment the count
			$word_list{$word}++;
		}
		else {
			$word_list{$word} = 1; # else it’s a new word, start with 1
		}
		$total_words++; # increment counter for total # of words
	}
}
print &quot;\nWord usage in $file\n\n\n&quot;;

foreach $key (sort sort_values (keys(%word_list))) { # get the keys, sort them by value, for each one
	print &quot;$word_list{$key} \t $key\n&quot;;
	$count ++; # increment counter for distinct # of words
}

print &quot;\n\nTotal of $total_words words, $count distinct words used\n&quot;;


sub sort_values { # sort a hash by value
$word_list{$a} &lt;=&gt; $word_list{$b};
}</description>
		<content:encoded><![CDATA[<p>Thanks for the Perl script, Sean.  I had a similar idea for my own book, and was glad to use it.  I had a wider range of punctuation, though, and decided I wanted to convert everything to lower case before collecting the histogram.  Here&#8217;s my modified version if you want it.</p>
<p>Regards,<br />
-Phil</p>
<p>my %word_list; # this hash will have an entry for every distinct word<br />
my $count = 0; # count of distinct words<br />
my $total_words =0; # total # of words used<br />
my $file = shift; # name of text file to parse</p>
<p>open FILE, &quot;&lt;$file&quot;;</p>
<p>while (&lt;FILE&gt;){ # for each line in the file</p>
<p>	my @words = split; # split each line into separate words</p>
<p>	foreach my $word (@words) { # for each word on that line</p>
<p>		$word =~ s/[.]//g;  # strip any periods<br />
		$word =~ s/”//g;  # strip any quotes<br />
		$word =~ s/“//g;  # strip any quotes<br />
		$word =~ s/&quot;//g;  # strip any quotes<br />
		$word =~ s/,//g;  # strip any commas<br />
		$word =~ s/’//g;  # strip any apostrophes<br />
		$word =~ s/://g;  # strip any colons<br />
		$word =~ s/&#039;//g;  # strip any single-quotes<br />
		$word =~ s/;//g;  # strip any semicolons<br />
		$word =~ s/!//g;  # strip any exclamation points<br />
		$word =~ s/\?//g;  # strip any question marks</p>
<p>		$word = lc($word);<br />
		# unless ($word =~ /ly/) {next;} # remove the first # sign if you want to look for adverbs</p>
<p>		if ($word_list{$word}) { # if the word has already been seen increment the count<br />
			$word_list{$word}++;<br />
		}<br />
		else {<br />
			$word_list{$word} = 1; # else it’s a new word, start with 1<br />
		}<br />
		$total_words++; # increment counter for total # of words<br />
	}<br />
}<br />
print &quot;\nWord usage in $file\n\n\n&quot;;</p>
<p>foreach $key (sort sort_values (keys(%word_list))) { # get the keys, sort them by value, for each one<br />
	print &quot;$word_list{$key} \t $key\n&quot;;<br />
	$count ++; # increment counter for distinct # of words<br />
}</p>
<p>print &quot;\n\nTotal of $total_words words, $count distinct words used\n&quot;;</p>
<p>sub sort_values { # sort a hash by value<br />
$word_list{$a} &lt;=&gt; $word_list{$b};<br />
}</p>
]]></content:encoded>
	</item>
</channel>
</rss>

