Posts Tagged ‘python’

* Cybook t4b Format Specification

Posted on January 13th, 2010 by John. Filed under hardware.

The new epub thumbnail files (.epub.thn) are what Bookeen calls t4b files. They are very similar to the older t2b thumbnail files used in earlier versions of the Cybook firmware. As the name suggests, 4 bits are now used to represent each color value instead of 2. This increases the number of colors from 4 to 16. In addition to the increased color range, t4b files now require a header of “t4bp” (without the quotes).

The image’s dimensions are 96×144. The bits representing 0, 1, 2, 3… are written directly to the file, two pixels per byte. It is very similar to a pgm file in this regard. Each 4 bit sequence represents a pixel color. Only black, white and shades of gray are supported.

Every t4b file will have 13,824 pixels. The file size will always be 6,916 bytes. The formula to determine this is: ((height x width x 4 bits per pixel) / 8 bits per byte) + 4 header bytes. ((96 * 144 * 4) / 8) + 4 = 6,916. The + 4 is the header.
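The size arithmetic and the two-pixels-per-byte packing can be checked with a short sketch. The helper names (t4b_size, pack_pixels, unpack_byte) are made up for illustration. It assumes the pixel values have already been reduced to the 0 to 15 range and that the first pixel of each pair goes into the high 4 bits.

```python
WIDTH, HEIGHT = 96, 144
HEADER = b"t4bp"
BITS_PER_PIXEL = 4


def t4b_size(width=WIDTH, height=HEIGHT):
    """Return the expected file size in bytes, header included."""
    return (width * height * BITS_PER_PIXEL) // 8 + len(HEADER)


def pack_pixels(first, second):
    """Pack two 4 bit grey values (0-15) into one byte, first pixel high."""
    return (first << 4) | second


def unpack_byte(value):
    """Split one byte back into its two 4 bit grey values."""
    return value >> 4, value & 0x0F


print(t4b_size())          # 6916
print(pack_pixels(15, 0))  # 240
print(unpack_byte(240))    # (15, 0)
```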

Following are two python scripts for converting an image to a t4b file and for converting a t4b file into a pgm image.

#!/usr/bin/env python

import sys

from PIL import Image

REDUCE_MARKS = [16, 32, 48, 64, 80, 96, 112, 128, 144, 160, 176, 192, 208, 224, 240]

def reduce_color(c):
    val = 0
    for mark in REDUCE_MARKS:
        if c > mark:
            val += 1
    return val

def main():
    if len(sys.argv) != 3:
        raise Exception('Must have 2 arguments. %s input.image output.epub.thn' % sys.argv[0])

    outf = open(sys.argv[2], 'wb')
    # Every t4b file starts with the 4 byte header.
    outf.write('t4bp')

    im = Image.open(sys.argv[1]).convert("L")
    im.thumbnail((96, 144))

    newim = Image.new('L', (96, 144), 'white')

    x,y = im.size
    newim.paste(im, ((96-x)/2, (144-y)/2))


    px = []
    for p in newim.getdata():
        # Collect pixels in pairs; two 4 bit values make up one byte.
        px.append(p)
        if len(px) == 2:
            byte_val = bin(reduce_color(px[0]))[2:].zfill(4) + bin(reduce_color(px[1]))[2:].zfill(4)
            outf.write(chr(int(byte_val, 2)))
            px = []
        elif len(px) > 2:
            raise Exception('Fatal error px length increased past 2.')

    outf.close()

if __name__ == '__main__':
    main()

#!/usr/bin/env python

import sys, os

def get_greys(b):
    if not b:
        return 0, 0

    b = bin(int(ord(b)))
    b = b[2:].zfill(8)

    w = str(int(b[0:4], 2))
    x = str(int(b[4:8], 2))

    return w, x

def main():
    if len(sys.argv) != 3:
        raise Exception('Must have 2 arguments. %s input.epub.thn output.pgm' % sys.argv[0])

    t4bfile = open(sys.argv[1], 'rb')
    pgmfile = open(sys.argv[2], 'w')

    pgmfile.write('P2\n96 144\n15\n')

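    # Read past the t4b header
    t4bfile.read(4)

    for i in range(144):
        for j in range(48):
            # Each byte holds two pixels.
            b = t4bfile.read(1)

            vals = get_greys(b)
            pgmfile.write('%s %s ' % (vals[0], vals[1]))
        pgmfile.write('\n')

    t4bfile.close()
    pgmfile.close()

if __name__ == '__main__':
    main()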

* Unidecoder

Posted on October 31st, 2009 by John. Filed under programming.

A while back I made a post about ASCIIizing Text. With it was a simple python application that would convert Unicode characters to ASCII equivalents. It doesn’t just do a basic conversion; it also Latinizes the characters when they are outside of the ASCII range.

The uni2ascii package I made has a few shortcomings I’ve decided to fix. The four major problems with it are: 1) Very basic permission checking, 2) Only accepts one file, 3) Requires all input to be UTF8 encoded, 4) The decoder is a very literal port of the ruby version.

To fix these issues I’ve written an entirely new script. Problems 1, 2 and 3 are fixed. It has robust error checking, can handle an arbitrary number of files, and the file encoding can be specified. Number 4 is fixed by using the Python port created by Tomaz Solc.

I’ve put the source code for the new decoder into a Launchpad branch:

$ bzr branch lp:~user-none/+junk/unidecoder

* Niw Markdown Editor

Posted on August 30th, 2009 by John. Filed under niwmarkdowneditor.

For the past three weeks I’ve been working on an editor for working with plain text files and making it easy to add markdown syntax to them. My main goal is to make it easier to format the large number of ebooks I have. Almost all of them are plain text files.

It’s a python project using PyQt4 and I’m hosting it on Launchpad. Here is the project page, and you can find some screenshots here.

The features of this application that make it more useful than a generic text editor are the toolbox and the tools. The toolbox allows for a number of markdown syntax changes to be made with one click. The tools menu supports a number of options that make formatting text a bit easier.

The current tools are:

  • Heading list which shows a listing of all headings in the document
  • Link list which shows a listing of all links in the document
  • Image list which shows a listing of all images in the document
  • ASCIIize which will turn all unicode characters into an ASCII equivalent
  • Remove leading spaces
  • Remove trailing spaces
  • Replace tabs with spaces
  • Separate paragraphs
  • Double line breaks
  • Remove excessive line breaks
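As a rough idea of what a few of these text tools do, here is a minimal sketch in plain Python. The function names and the exact rules (for example, collapsing three or more newlines into one blank line, or a tab width of 4) are assumptions for illustration and are not taken from the editor's source.

```python
import re


def remove_leading_spaces(text):
    """Strip spaces and tabs from the start of every line."""
    return "\n".join(line.lstrip(" \t") for line in text.split("\n"))


def remove_trailing_spaces(text):
    """Strip spaces and tabs from the end of every line."""
    return "\n".join(line.rstrip(" \t") for line in text.split("\n"))


def replace_tabs(text, width=4):
    """Replace every tab with a fixed number of spaces."""
    return text.replace("\t", " " * width)


def remove_excessive_line_breaks(text):
    """Collapse runs of three or more newlines into a single blank line."""
    return re.sub(r"\n{3,}", "\n\n", text)


sample = "  First line  \n\n\n\n\tSecond line\t"
cleaned = remove_excessive_line_breaks(
    remove_trailing_spaces(remove_leading_spaces(sample)))
print(repr(cleaned))
```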

There are a number of other options such as line numbering, highlighting of the line and syntax, and inline spell check.

There is still a lot I would like to do with the project. For one thing it needs an icon. It also needs build targets for Windows and OS X, and image previews in the Image listing. Take a look at the TODO file to get a feel for what I have in mind for the near future.

For those of you who want to test it out you can find a tarball here. The dependencies are:

  • Python 2.6
  • Qt >= 4.5
  • PyQt >= 4.5
  • python-markdown
  • python-enchant (optional for spell check)

* QPlainTextEdit With In Line Spell Check

Posted on August 22nd, 2009 by John. Filed under programming.

***Update: Simplified Highlighter.highlightBlock function

One thing Qt lacks is an integrated spell check in the text entry components. For a project I’m working on this is necessary. Using python-enchant and the QSyntaxHighlighter I was able to implement this functionality. Here is how to add in-line spell check support to a QPlainTextEdit.

#!/usr/bin/env python
# -*- coding: utf-8 -*-

__license__ = 'MIT'
__copyright__ = '2009, John Schember '
__docformat__ = 'restructuredtext en'

import re
import sys

import enchant

from PyQt4.Qt import QAction
from PyQt4.Qt import QApplication
from PyQt4.Qt import QEvent
from PyQt4.Qt import QMenu
from PyQt4.Qt import QMouseEvent
from PyQt4.Qt import QPlainTextEdit
from PyQt4.Qt import QSyntaxHighlighter
from PyQt4.Qt import QTextCharFormat
from PyQt4.Qt import QTextCursor
from PyQt4.Qt import Qt
from PyQt4.QtCore import pyqtSignal

class SpellTextEdit(QPlainTextEdit):

    def __init__(self, *args):
        QPlainTextEdit.__init__(self, *args)

        # Default dictionary based on the current locale.
        self.dict = enchant.Dict()
        self.highlighter = Highlighter(self.document())
        # The highlighter does nothing until it is given a dictionary.
        self.highlighter.setDict(self.dict)

    def mousePressEvent(self, event):
        if event.button() == Qt.RightButton:
            # Rewrite the mouse event to a left button event so the cursor is
            # moved to the location of the pointer.
            event = QMouseEvent(QEvent.MouseButtonPress, event.pos(),
                Qt.LeftButton, Qt.LeftButton, Qt.NoModifier)
        QPlainTextEdit.mousePressEvent(self, event)

    def contextMenuEvent(self, event):
        popup_menu = self.createStandardContextMenu()

        # Select the word under the cursor.
        cursor = self.textCursor()
        cursor.select(QTextCursor.WordUnderCursor)
        self.setTextCursor(cursor)

        # Check if the selected word is misspelled and offer spelling
        # suggestions if it is.
        if self.textCursor().hasSelection():
            text = unicode(self.textCursor().selectedText())
            if not self.dict.check(text):
                spell_menu = QMenu('Spelling Suggestions')
                for word in self.dict.suggest(text):
                    action = SpellAction(word, spell_menu)
                    action.correct.connect(self.correctWord)
                    spell_menu.addAction(action)
                # Only add the spelling suggestions to the menu if there are
                # suggestions.
                if len(spell_menu.actions()) != 0:
                    popup_menu.insertSeparator(popup_menu.actions()[0])
                    popup_menu.insertMenu(popup_menu.actions()[0], spell_menu)

        popup_menu.exec_(event.globalPos())

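    def correctWord(self, word):
        '''
        Replaces the selected text with word.
        '''
        cursor = self.textCursor()
        cursor.beginEditBlock()

        cursor.removeSelectedText()
        cursor.insertText(word)

        cursor.endEditBlock()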


class Highlighter(QSyntaxHighlighter):

    WORDS = u'(?iu)[\w\']+'

    def __init__(self, *args):
        QSyntaxHighlighter.__init__(self, *args)

        self.dict = None

    def setDict(self, dict):
        self.dict = dict

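    def highlightBlock(self, text):
        if not self.dict:
            return

        text = unicode(text)

        # Misspelled words get a red spell check underline.
        format = QTextCharFormat()
        format.setUnderlineColor(Qt.red)
        format.setUnderlineStyle(QTextCharFormat.SpellCheckUnderline)

        for word_object in re.finditer(self.WORDS, text):
            if not self.dict.check(word_object.group()):
                self.setFormat(word_object.start(),
                    word_object.end() - word_object.start(), format)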

class SpellAction(QAction):

    '''
    A special QAction that returns the text in a signal.
    '''

    correct = pyqtSignal(unicode)

    def __init__(self, *args):
        QAction.__init__(self, *args)

        self.triggered.connect(lambda x: self.correct.emit(
            unicode(self.text())))

def main(args=sys.argv):
    app = QApplication(args)

    spellEdit = SpellTextEdit()
    spellEdit.show()

    return app.exec_()

if __name__ == '__main__':
    sys.exit(main())

The SpellTextEdit’s purpose is straightforward. It will mark misspelled words. Right clicking on a word in the SpellTextEdit will cause the word to become selected and display a context menu. If the word is misspelled and there are spelling suggestions the context menu will include a sub menu of those suggestions. Selecting a suggestion will replace the misspelled text with the selection.

The Highlighter class takes text, breaks it into words, checks if they are spelled correctly and if not underlines the misspelled ones with a red squiggle. I’m using a regular expression to split the words instead of using str.split because str.split will only split on whitespace and include punctuation (e.g. “.!*) as part of the words.
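The difference between the two ways of splitting can be seen with a short standalone sketch. It uses the same character class as the WORDS pattern but drops the (?iu) flags, on the assumption that it is run on Python 3 where \w already matches Unicode word characters. The sample sentence is made up.

```python
import re

# Same character class as Highlighter.WORDS: word characters and apostrophes.
WORDS = r"[\w']+"

text = 'He said "helo, wrld!" and it\'s fine.'

# str.split only splits on whitespace, so punctuation sticks to the words.
print(text.split())

# The regular expression keeps apostrophes but drops the other punctuation.
print(re.findall(WORDS, text))
```

With str.split a word such as "helo," would be handed to the dictionary with its quote and comma attached and would always be reported as misspelled.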

SpellAction is a simple class that allows for the action’s text to be sent with the signal. This is necessary for dynamically creating the list of possible correction words in the right click menu. The SpellAction is connected to a function that replaces the selected text with the signal text.

* Better QPlainTextEdit With Line Numbers

Posted on August 19th, 2009 by John. Filed under programming.

My last post was an implementation of a Qt widget which displays text with line numbers. I found that it has a few limitations. The biggest was a performance penalty when dealing with large documents, because every paint walked all blocks from the start of the document. I’ve since refactored and rewritten the class to make the performance acceptable. I’ve also cleaned up the code a bit and added a highlight to the current line.

'''
Text widget with support for line numbers
'''

from PyQt4.Qt import QFrame
from PyQt4.Qt import QHBoxLayout
from PyQt4.Qt import QPainter
from PyQt4.Qt import QPlainTextEdit
from PyQt4.Qt import QRect
from PyQt4.Qt import QTextEdit
from PyQt4.Qt import QTextFormat
from PyQt4.Qt import QVariant
from PyQt4.Qt import QWidget
from PyQt4.Qt import Qt

class LNTextEdit(QFrame):

    class NumberBar(QWidget):

        def __init__(self, edit):
            QWidget.__init__(self, edit)

            self.edit = edit

        def paintEvent(self, event):
            self.edit.numberbarPaint(self, event)
            QWidget.paintEvent(self, event)

        def adjustWidth(self, count):
            width = self.fontMetrics().width(unicode(count))
            if self.width() != width:
                self.setFixedWidth(width)

        def updateContents(self, rect, scroll):
            if scroll:
                self.scroll(0, scroll)
            else:
                # It would be nice to do
                # self.update(0, rect.y(), self.width(), rect.height())
                # But we can't because it will not remove the bold on the
                # current line if word wrap is enabled and a new block is
                # selected.
                self.update()

    class PlainTextEdit(QPlainTextEdit):

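        def __init__(self, *args):
            QPlainTextEdit.__init__(self, *args)

            self.setFrameStyle(QFrame.NoFrame)
            self.highlight()

            # Move the highlight whenever the cursor changes line.
            self.cursorPositionChanged.connect(self.highlight)

        def highlight(self):
            hi_selection = QTextEdit.ExtraSelection()

            hi_selection.format.setBackground(self.palette().alternateBase())
            hi_selection.format.setProperty(QTextFormat.FullWidthSelection, QVariant(True))
            hi_selection.cursor = self.textCursor()
            hi_selection.cursor.clearSelection()

            self.setExtraSelections([hi_selection])
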

        def numberbarPaint(self, number_bar, event):
            font_metrics = self.fontMetrics()
            current_line = self.document().findBlock(self.textCursor().position()).blockNumber() + 1

            block = self.firstVisibleBlock()
            line_count = block.blockNumber()
            painter = QPainter(number_bar)
            painter.fillRect(event.rect(), self.palette().base())

            # Iterate over all visible text blocks in the document.
            while block.isValid():
                line_count += 1
                block_top = self.blockBoundingGeometry(block).translated(self.contentOffset()).top()

                # Check if the position of the block is out side of the visible
                # area.
                if not block.isVisible() or block_top >= event.rect().bottom():
                    break

                # We want the line number for the selected line to be bold.
                if line_count == current_line:
                    font = painter.font()
                    font.setBold(True)
                    painter.setFont(font)
                else:
                    font = painter.font()
                    font.setBold(False)
                    painter.setFont(font)

                # Draw the line number right justified at the position of the line.
                paint_rect = QRect(0, block_top, number_bar.width(), font_metrics.height())
                painter.drawText(paint_rect, Qt.AlignRight, unicode(line_count))

                block = block.next()

            painter.end()


    def __init__(self, *args):
        QFrame.__init__(self, *args)

        self.setFrameStyle(QFrame.StyledPanel | QFrame.Sunken)

        self.edit = self.PlainTextEdit()
        self.number_bar = self.NumberBar(self.edit)

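        hbox = QHBoxLayout(self)
        hbox.addWidget(self.number_bar)
        hbox.addWidget(self.edit)

        # Keep the number bar in sync with the text edit.
        self.edit.blockCountChanged.connect(self.number_bar.adjustWidth)
        self.edit.updateRequest.connect(self.number_bar.updateContents)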


    def getText(self):
        return unicode(self.edit.toPlainText())

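    def setText(self, text):
        self.edit.setPlainText(text)

    def isModified(self):
        return self.edit.document().isModified()

    def setModified(self, modified):
        self.edit.document().setModified(modified)

    def setLineWrapMode(self, mode):
        self.edit.setLineWrapMode(mode)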

* QTextEdit With Line Numbers

Posted on August 15th, 2009 by John. Filed under programming.

Here is a Qt4 widget written in Python that allows for line numbers next to a QTextEdit, similar to what is seen in a number of text editors such as gedit and kate.

from PyQt4.Qt import QFrame, QWidget, QTextEdit, QHBoxLayout, QPainter

class LineTextWidget(QFrame):

    class NumberBar(QWidget):

        def __init__(self, *args):
            QWidget.__init__(self, *args)
            self.edit = None
            # This is used to update the width of the control.
            # It is the highest line that is currently visible.
            self.highest_line = 0

        def setTextEdit(self, edit):
            self.edit = edit

        def update(self, *args):
            '''
            Updates the number bar to display the current set of numbers.
            Also, adjusts the width of the number bar if necessary.
            '''
            # The + 4 is used to compensate for the current line being bold.
            width = self.fontMetrics().width(str(self.highest_line)) + 4
            if self.width() != width:
                self.setFixedWidth(width)
            QWidget.update(self, *args)

        def paintEvent(self, event):
            contents_y = self.edit.verticalScrollBar().value()
            page_bottom = contents_y + self.edit.viewport().height()
            font_metrics = self.fontMetrics()
            current_block = self.edit.document().findBlock(self.edit.textCursor().position())

            painter = QPainter(self)

            line_count = 0
            # Iterate over all text blocks in the document.
            block = self.edit.document().begin()
            while block.isValid():
                line_count += 1

                # The top left position of the block in the document
                position = self.edit.document().documentLayout().blockBoundingRect(block).topLeft()

                # Check if the position of the block is outside of the visible
                # area.
                if position.y() > page_bottom:
                    break

                # We want the line number for the selected line to be bold.
                bold = False
                if block == current_block:
                    bold = True
                    font = painter.font()
                    font.setBold(True)
                    painter.setFont(font)

                # Draw the line number right justified at the y position of the
                # line. 3 is a magic padding number. drawText(x, y, text).
                painter.drawText(self.width() - font_metrics.width(str(line_count)) - 3, round(position.y()) - contents_y + font_metrics.ascent(), str(line_count))

                # Remove the bold style if it was set previously.
                if bold:
                    font = painter.font()
                    font.setBold(False)
                    painter.setFont(font)

                block = block.next()

            self.highest_line = line_count
            painter.end()

            QWidget.paintEvent(self, event)

    def __init__(self, *args):
        QFrame.__init__(self, *args)

        self.setFrameStyle(QFrame.StyledPanel | QFrame.Sunken)

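        self.edit = QTextEdit()

        self.number_bar = self.NumberBar()
        self.number_bar.setTextEdit(self.edit)

        hbox = QHBoxLayout(self)
        hbox.addWidget(self.number_bar)
        hbox.addWidget(self.edit)

        # Watch the edit and its viewport so the numbers can be redrawn.
        self.edit.installEventFilter(self)
        self.edit.viewport().installEventFilter(self)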

    def eventFilter(self, object, event):
        # Update the line numbers for all events on the text edit and the viewport.
        # This is easier than connecting all necessary signals.
        if object in (self.edit, self.edit.viewport()):
            self.number_bar.update()
            return False
        return QFrame.eventFilter(self, object, event)

    def getTextEdit(self):
        return self.edit

* ASCIIize Text

Posted on July 24th, 2009 by John. Filed under programming.

One pet peeve I have with my Cybook Gen 3 is its inability to properly display unicode characters in plain text files. I don’t need anything fancy like Japanese characters, just simple things like “ and ” (as opposed to " and "). To solve this problem I’ve been thinking about adding an --asciize option to calibre. I say thinking because I didn’t really know where to start. Thankfully a user recently requested this very functionality in bug #2846. He even included a link to existing work that accomplishes this very task.

I will be integrating transliteration of unicode to ascii into calibre soon. However, in the meantime here is a script and classes to accomplish this task outside of calibre (see unidecoder for a better method). This is my python port of the ruby unidecode gem, which is itself a port of the original perl Text::Unidecode.

The major differences between my implementation and the others are that it’s written in python and that it uses a single dictionary instead of loading the code group files as needed.
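The single dictionary approach can be sketched in a few lines. The table below is a tiny, made up stand-in for the real mapping, and the asciiize function name and the "?" fallback for unknown characters are assumptions for illustration, not the actual script.

```python
# Hypothetical, very small lookup table. The real table covers far more
# characters; this one only exists to show the idea.
TABLE = {
    u"\u201c": '"',    # left double quotation mark
    u"\u201d": '"',    # right double quotation mark
    u"\u2018": "'",    # left single quotation mark
    u"\u2019": "'",    # right single quotation mark
    u"\u2026": "...",  # horizontal ellipsis
    u"\u00e9": "e",    # e with acute accent
}


def asciiize(text, table=TABLE, unknown="?"):
    """Replace every non-ASCII character using one dictionary lookup."""
    out = []
for_each = None  # placeholder removed below
```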

You can find out more on how this all works at

* History Drop Down With Model

Posted on July 23rd, 2009 by John. Filed under programming.

Following is a bit of python code that illustrates how to create a QComboBox that attaches to a model for listing history items. The main features of this code are: items entered in the text area of the combo are added to the history; selected items and entered items that already appear in the combo are moved to the top; and when MAX_ITEMS is exceeded, older items (items at the bottom of the drop down) are removed.
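Stripped of the Qt parts, the history rules come down to a small list operation. This is a minimal sketch; the add_to_history helper is hypothetical and returns a new list instead of updating a model in place.

```python
MAX_ITEMS = 5


def add_to_history(items, value, max_items=MAX_ITEMS):
    """Return a new history list with value moved or added to the top."""
    # Drop any existing copy so a repeated entry moves instead of duplicating.
    updated = [item for item in items if item != value]
    updated.insert(0, value)
    # Older items sit at the bottom, so truncating removes the oldest ones.
    return updated[:max_items]


history = [u'123', u'456', u'789']
history = add_to_history(history, u'456')  # existing item moves to the top
history = add_to_history(history, u'abc')  # new item is added at the top
print(history)
```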

#!/usr/bin/env python

from PyQt4.Qt import *
from PyQt4.QtGui import *

class ComboModel(QAbstractListModel):

    MAX_ITEMS = 5
    items = [u'123', u'456', u'789']

    # Required to get a working model.
    def rowCount(self, parent=QModelIndex()):
        # This is a List model meaning all elements are root elements.
        if parent and parent.isValid():
            return 0
        return len(self.items)

    # Required to get a working model.
    def insertRows(self, row, count, parent=QModelIndex()):
        self.beginInsertRows(parent, 0, 1)
        self.endInsertRows()
        return True

    def data(self, index, role):
        if role in (Qt.DisplayRole, Qt.EditRole):
            return QVariant(self.items[index.row()])
        return QVariant()

    def setData(self, index, value, role):
        value = unicode(value.toString())
        if value in self.items:
            # Move the item to the top of the list.
            del self.items[self.items.index(value)]
            self.items.insert(0, value)
        else:
            # Add the new item to the top of the list.
            self.items.insert(0, value)
        # Drop the oldest items if MAX_ITEMS has been exceeded.
        self.remove_items()
        self.emit(SIGNAL('dataChanged(QModelIndex, QModelIndex)'),
            self.createIndex(0, 0), self.createIndex(len(self.items) - 1, 0))
        return True

    def remove_items(self):
        '''
        Checks the number of items in the list and if it has been exceeded
        removes extra items.
        '''
        if len(self.items) > self.MAX_ITEMS:
            count = len(self.items) - self.MAX_ITEMS
            del self.items[-count:]

    def order_items(self, index):
        '''
        Move the selected item to the top of the list.
        '''
        # We only need to move the item to the top if it is not the first item.
        if index > 0:
            value = self.items[index]
            del self.items[index]
            self.items.insert(0, value)

class MComboBox(QComboBox):

    def __init__(self, parent=None):
        QComboBox.__init__(self, parent)
        self.setEditable(True)
        # The default policy is InsertAtBottom. InsertAtTop must be set
        # otherwise the model will move the item to the top and the QComboBox
        # will select the last index. In the case of reaching max items the
        # index will be invalid because the removal is after the QComboBox has
        # stored the value of the last index.
        self.setInsertPolicy(QComboBox.InsertAtTop)
        # Without this the QComboBox will not call set data for us to check
        # for items already in the model if the item is a duplicate.
        self.setDuplicatesEnabled(True)
        self._model = ComboModel()
        self.setModel(self._model)

        # We need to tell the model when an item is selected so it can be moved
        # to the top of the history list.
        self.connect(self, SIGNAL('activated(int)'), self._model.order_items)

def main():
    app = QApplication([])

    bx = MComboBox()
    bx.show()

    app.exec_()

if __name__ == '__main__':
    main()

One thing to note is that in this example the model stores the items in a list called items. This can be replaced with some other way to retrieve the history items. For example with a connection to an SQLite DB.

* QCompleter and Comma-Separated Tags

Posted on July 4th, 2009 by John. Filed under programming.

Here is a python script demonstrating how to use QCompleter to complete multiple tags in a QLineEdit. A few features of this script are: removing tags from the drop down that already appear in the QLineEdit, caching the tags, and inserting a , after completion to ease adding more tags. There are a few parts of this script that I’m going to go into detail about.

#!/usr/bin/env python

'''
Copyright (c) 2009 John Schember

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
'''


import sys

from PyQt4.Qt import Qt, QObject, QApplication, QLineEdit, QCompleter, \
    QStringListModel, SIGNAL

TAGS = ['Nature', 'buildings', 'home', 'City', 'country', 'Berlin']

class CompleterLineEdit(QLineEdit):

    def __init__(self, *args):
        QLineEdit.__init__(self, *args)

        QObject.connect(self, SIGNAL('textChanged(QString)'), self.text_changed)

    def text_changed(self, text):
        all_text = unicode(text)
        text = all_text[:self.cursorPosition()]
        prefix = text.split(',')[-1].strip()

        text_tags = []
        for t in all_text.split(','):
            t1 = unicode(t).strip()
            if t1 != '':
                text_tags.append(t1)
        text_tags = list(set(text_tags))

        self.emit(SIGNAL('text_changed(PyQt_PyObject, PyQt_PyObject)'),
            text_tags, prefix)

    def complete_text(self, text):
        cursor_pos = self.cursorPosition()
        before_text = unicode(self.text())[:cursor_pos]
        after_text = unicode(self.text())[cursor_pos:]
        prefix_len = len(before_text.split(',')[-1].strip())
        self.setText('%s%s, %s' % (before_text[:cursor_pos - prefix_len], text,
            after_text))
        self.setCursorPosition(cursor_pos - prefix_len + len(text) + 2)

class TagsCompleter(QCompleter):

    def __init__(self, parent, all_tags):
        QCompleter.__init__(self, all_tags, parent)
        self.all_tags = set(all_tags)

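    def update(self, text_tags, completion_prefix):
        # Offer only the tags that are not already in the line edit.
        tags = list(self.all_tags.difference(text_tags))
        model = QStringListModel(tags, self)
        self.setModel(model)

        self.setCompletionPrefix(completion_prefix)
        if completion_prefix.strip() != '':
            self.complete()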

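def main():
    app = QApplication(sys.argv)

    editor = CompleterLineEdit()

    completer = TagsCompleter(editor, TAGS)
    completer.setCaseSensitivity(Qt.CaseInsensitive)

    QObject.connect(editor,
        SIGNAL('text_changed(PyQt_PyObject, PyQt_PyObject)'),
        completer.update)
    QObject.connect(completer, SIGNAL('activated(QString)'),
        editor.complete_text)

    completer.setWidget(editor)

    editor.show()

    return app.exec_()

if __name__ == '__main__':
    main()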

Looking at the main() function the editor widget’s text_changed signal is connected to the completer’s update slot. This serves two purposes. It provides the completer with all tags that are in the editor. Also, it provides the completer with the prefix of the current text that is being entered. The prefix is used for listing matching tags that are stored in the completer’s cache.
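The two pieces of information passed to the completer can be worked out without Qt. The sketch below is only an approximation of the behaviour described above: split_tags and candidates are hypothetical helpers, and the case-insensitive prefix match is an assumption about how the completer is configured.

```python
TAGS = ['Nature', 'buildings', 'home', 'City', 'country', 'Berlin']


def split_tags(all_text, cursor_pos):
    """Return (tags already in the text, prefix being typed at the cursor)."""
    prefix = all_text[:cursor_pos].split(",")[-1].strip()
    entered = {t.strip() for t in all_text.split(",") if t.strip()}
    return entered, prefix


def candidates(all_tags, entered, prefix):
    """Tags not yet entered that start with prefix, ignoring case."""
    if not prefix:
        return []
    remaining = set(all_tags) - entered
    return sorted(t for t in remaining if t.lower().startswith(prefix.lower()))


entered, prefix = split_tags("home, Ci", 8)
print(prefix)                             # Ci
print(candidates(TAGS, entered, prefix))  # ['City']
```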

The editor’s complete_text function takes the text in the editor before the cursor and subtracts the length of the prefix from that position, because the prefix will be included in the completed text. The text before, the completed text, a comma, a space, and the text after the cursor are combined. This becomes the text in the QLineEdit. The cursor is moved to the position it was at, minus the length of the prefix, plus the length of the completed text, plus 2 characters (, ) so that typing can immediately continue.
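The same string arithmetic can be tried outside of Qt. This is a sketch of the logic only: the standalone complete_text function below takes the full text and cursor position as arguments (an assumption made so it can run without a QLineEdit) and returns the new text and cursor position instead of setting them on a widget.

```python
def complete_text(full_text, cursor_pos, completion):
    """Return (new text, new cursor position) after inserting completion."""
    before = full_text[:cursor_pos]
    after = full_text[cursor_pos:]
    # The partially typed tag is replaced by the completion.
    prefix_len = len(before.split(",")[-1].strip())
    new_text = "%s%s, %s" % (before[:cursor_pos - prefix_len], completion, after)
    # Skip past the completion and the ", " that follows it.
    new_cursor = cursor_pos - prefix_len + len(completion) + 2
    return new_text, new_cursor


print(complete_text("home, Ci", 8, "City"))  # ('home, City, ', 12)
```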

TAGS can be replaced with a function that gets all relevant tags if caching is not wanted.

Also, note that completer.setWidget(editor) was used, not QLineEdit’s setCompleter() function. If setCompleter is used, completion will only take place at the beginning of the QLineEdit, meaning it will match everything before the cursor as one string and ignore the , delimiter separating the tags.

* Calibre Week in Review

Posted on May 31st, 2009 by John. Filed under calibre.

Not much happened this week. A few bug fixes and a new output format, RTF. It produces acceptable results. It also embeds images into the file. The output could use some tweaking, but this will come with time. The only caveat is the output is ascii only. This is to keep compatibility with Calibre’s RTF input, which can only accept ascii rtf files.

Pluginize has been merged back into trunk. Once a bit of testing is done by Kovid, he will be rolling out a beta for the 0.6 release. For those of you, like me, who use Ubuntu and build Calibre from source, there is a little change you will need to make in order to have it build. Open the file /usr/lib/python2.6/dist-packages/PyQt4/uic/Compiler/ and modify _qwidgets on line 238 to include “QWizardPage”.
