<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>John&#039;s Blog &#187; regex</title>
	<atom:link href="http://john.nachtimwald.com/tag/regex/feed/" rel="self" type="application/rss+xml" />
	<link>http://john.nachtimwald.com</link>
	<description>My little blog</description>
	<lastBuildDate>Sat, 31 Jul 2010 17:28:10 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Changing Single Quotation Marks to Double in eBooks</title>
		<link>http://john.nachtimwald.com/2009/05/12/changing-single-quotation-marks-to-double-in-ebooks/</link>
		<comments>http://john.nachtimwald.com/2009/05/12/changing-single-quotation-marks-to-double-in-ebooks/#comments</comments>
		<pubDate>Tue, 12 May 2009 20:28:22 +0000</pubDate>
		<dc:creator>John</dc:creator>
				<category><![CDATA[programming]]></category>
		<category><![CDATA[ebook]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[regex]]></category>

		<guid isPermaLink="false">http://john.nachtimwald.com/?p=116</guid>
		<description><![CDATA[As a person living in the USA, I strongly prefer double quotation marks to single quotes when denoting speech. Some authors use the single quote for effect, but mostly it&#8217;s just a style choice. I find UK authors generally use the two interchangeably. Tolkien&#8217;s books are a good example. I have The Hobbit, The Lord [...]]]></description>
			<content:encoded><![CDATA[<p>As a person living in the USA, I strongly prefer double quotation marks to single quotes when denoting speech. Some authors use the single quote for effect, but mostly it&#8217;s just a style choice. I find UK authors generally use the two interchangeably. Tolkien&#8217;s books are a good example. I have The Hobbit, The Lord of the Rings, and The Children of Hurin. The Hobbit uses double quotes while the other two use single quotes.</p>
<p>The following simple Python code takes a book (named th.txt) and changes its single quotes into double quotes. Both of the books in question use ‘ and ’ for the opening and closing quotes, and ’ is also used for contractions. The regexes take the opening and closing characters into account and also change the contractions to the plain ASCII &#39; character.</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">&gt;&gt;&gt; import re
&gt;&gt;&gt; th = open('th.txt', 'r+', encoding='utf-8')
&gt;&gt;&gt; th_t = th.read()
&gt;&gt;&gt; # ’ between two word characters is a contraction: use a plain apostrophe
&gt;&gt;&gt; th_t = re.sub(r'(?&lt;=\w)’(?=\w)', "'", th_t)
&gt;&gt;&gt; # the remaining opening and closing single quotes become double quotes
&gt;&gt;&gt; th_t = re.sub('‘', '"', th_t)
&gt;&gt;&gt; th_t = re.sub('’', '"', th_t)
&gt;&gt;&gt; th.seek(0)
&gt;&gt;&gt; th.truncate(0)
&gt;&gt;&gt; th.write(th_t)
&gt;&gt;&gt; th.close()</pre></div></div>
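
<p>To see what the substitutions do without touching a file, here is a minimal sketch that applies them to a short sample string. The function name single_to_double and the sample sentence are made up for illustration, and the contraction pattern is an assumption: it treats any ’ that sits between two word characters as a contraction, so possessives such as Bilbo’s are handled the same way.</p>

```python
import re


def single_to_double(text):
    """Apply the three quote substitutions to a string (illustrative helper)."""
    # Assumption: a ’ between two word characters is a contraction.
    text = re.sub(r"(?<=\w)’(?=\w)", "'", text)
    # Any curly single quote still left is treated as a speech mark.
    text = re.sub("‘", '"', text)
    text = re.sub("’", '"', text)
    return text


sample = "‘I don’t know,’ said Bilbo."
print(single_to_double(sample))
```

<p>The contraction pattern has to run first. Otherwise the ’ in don’t would be turned into a double quote along with the closing quotes.</p>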

<p>I do realize that the listed regexes could be combined a bit, especially the ones for the opening and closing quotes. However, that would reduce their readability.</p>
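
<p>For comparison, here is a rough sketch of what combining the opening and closing patterns might look like, using a single character class. The function name single_to_double_combined is hypothetical, and the contraction pattern again assumes that a ’ between two word characters is always a contraction.</p>

```python
import re


def single_to_double_combined(text):
    """Two patterns instead of three (illustrative helper)."""
    # Contractions first, so their ’ is not mistaken for a closing quote.
    text = re.sub(r"(?<=\w)’(?=\w)", "'", text)
    # One character class covers both the opening ‘ and the closing ’.
    return re.sub("[‘’]", '"', text)


print(single_to_double_combined("‘It’s late,’ he said."))
```

<p>This saves one call, at the cost of hiding the fact that opening and closing quotes are two separate cases.</p>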
]]></content:encoded>
			<wfw:commentRss>http://john.nachtimwald.com/2009/05/12/changing-single-quotation-marks-to-double-in-ebooks/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
