Posts Tagged ‘qt’

* Calibre Week in Review

Posted on May 31st, 2009 by John. Filed under calibre.


Not much happened this week: a few bug fixes and a new output format, RTF. It produces acceptable results and embeds images into the file. The output could use some tweaking, but this will come with time. The only caveat is that the output is ASCII only. This keeps it compatible with Calibre’s RTF input, which can only accept ASCII RTF files.

Pluginize has been merged back into trunk. Once a bit of testing is done by Kovid, he will be rolling out a beta for the 0.6 release. For those of you, like me, who use Ubuntu and build Calibre from source, there is a little change you will need to make in order to have it build. Open the file /usr/lib/python2.6/dist-packages/PyQt4/uic/Compiler/qtproxies.py and modify _qwidgets on line 238 to include “QWizardPage”.




* Building the eBook Tools

Posted on December 23rd, 2008 by John. Filed under programming.


It’s come to my attention that while I’ve posted a few eBook formatting tools I wrote and use, I never posted how to build them. Since I’m using Qt, the easiest way to build them is with qmake and make.

The build process is simple. Create a pro file for the project, say fix_end_ebook_txt.pro. Run qmake, then run make. You will end up with an executable. Just remember that this requires Qt, make, and a C++ compiler (g++ on *nix or MinGW on Windows).

fix_end_ebook_txt.pro

SOURCES += fix_end_ebook_txt.cpp
CONFIG += qt
TARGET = fix_end_ebook_txt

The above pro file is minimal and can be tuned further for a specific project, but at the very least it shows how to build the Qt eBook tools I’ve posted.




* eBook Adding Empty Lines At End of File

Posted on December 22nd, 2008 by John. Filed under programming.


Continuing my work to clean up my eBooks I’ve written another little tool to help. I like for my eBooks to have two blank lines at the end of the file.

The only major caveat is that it assumes Unix line endings, meaning a single \n character. For files that use a different newline format, run the dos2unix tool first so this works correctly.

fix_end_ebook_txt.cpp

/*
Copyright (c) 2008 John Schember 
 
Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:
 
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
 
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/
 
/*
Ensures that there are 3 newline characters at the end of the file (two blank
lines after the last of the text). This assumes Unix \n line characters. Please
use dos2unix before running to ensure that the end of line characters are
correct.
*/
 
#include <QFile>
#include <QString>
#include <QTextStream>
 
int main(int argc, char **argv)
{
    // Stream to write errors to the console.
    QTextStream errStream(stderr);
 
    // Store for the contents of the ebook.
    QString content;
 
    // We need an ebook file to work on.
    if (argc != 2) {
        errStream << QObject::tr("Error: No input file") << endl;
        return 1;
    }
 
    QFile ebook(argv[1]);
    if (!ebook.open(QIODevice::ReadWrite | QIODevice::Text)) {
        errStream << QObject::tr("Error: Could not open") << endl;
        return 1;
    }
 
    // We use a QTextStream to actually work on the file.
    QTextStream ioStream(&ebook);
 
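    // We want to see what the last 3 characters are at the end of the file.
    // Clamp the position so files shorter than 3 characters are handled.
    ioStream.seek(qMax(qint64(0), ebook.size() - 3));
    content = ioStream.read(3);
 
    // Count only the newlines (\n's) that actually end the file, so a newline
    // followed by text is not mistaken for a trailing one.
    int trailing = 0;
    while (trailing < content.size()
           && content.at(content.size() - 1 - trailing) == QChar('\n')) {
        trailing++;
    }
 
    // Move to the end of the file because we want to add newlines to the end.
    ioStream.seek(ebook.size());
 
    // We want 3 newline (\n) characters at the end of the file. Add them until
    // they total 3.
    for (int i = 0; i < (3 - trailing); i++) {
        ioStream << "\n" << flush;
    }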
 
    ebook.close();
 
    return 0;
}
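
For anyone who would rather not depend on dos2unix, here is a rough sketch of the same idea using only the standard C++ library instead of Qt. It is not the tool above: normalizeNewlines and ensureTrailingNewlines are hypothetical helper names, and the sketch assumes the whole file has already been read into a std::string.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: convert DOS (\r\n) and lone \r line endings to \n.
std::string normalizeNewlines(const std::string &text)
{
    std::string out;
    out.reserve(text.size());
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '\r') {
            // Treat "\r\n" as one line ending; a lone '\r' also becomes '\n'.
            if (i + 1 < text.size() && text[i + 1] == '\n') {
                ++i;
            }
            out += '\n';
        } else {
            out += text[i];
        }
    }
    return out;
}

// Hypothetical helper: append newlines until the text ends with at least
// `wanted` of them. Existing trailing newlines are counted, never removed.
std::string ensureTrailingNewlines(std::string text, std::size_t wanted = 3)
{
    std::size_t trailing = 0;
    while (trailing < text.size() && text[text.size() - 1 - trailing] == '\n') {
        ++trailing;
    }
    if (trailing < wanted) {
        text.append(wanted - trailing, '\n');
    }
    return text;
}
```

Calling ensureTrailingNewlines(normalizeNewlines(contents)) on the file contents should leave the text ending with two blank lines regardless of which line ending style the file started with.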




* eBook Paragraph Formatting

Posted on December 21st, 2008 by John. Filed under programming.


Today I wrote two simple programs to help me clean up my ebooks. I prefer to keep my ebook collection as plain text files with paragraphs separated by a blank line. The first program reflows the paragraphs to put each on a single line. The second removes extraneous whitespace from the file.

The reflow is the more intensive of the two. I ran it on the largest ebook I have, Project Gutenberg’s War and Peace by Leo Tolstoy. The file is 3.1 MB.

Time to run: 7m35.494s.
Memory usage: 13.1 MB according to gnome-system-monitor.

Right now I’m loading the entire book into memory and using QStrings to work on it. Memory usage is about 4.5 x the size of the book. Thankfully plain text ebooks are fairly small. Later I’m going to look into optimizing it for size and hopefully speed.

Without further ado, here are the two. They are MIT licensed and use the Qt toolkit.

fix_paragraphs_ebook_txt.cpp

/*
Copyright (c) 2008 John Schember 
 
Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:
 
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
 
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/
 
/*
Reflows txt file ebook paragraphs. Paragraphs should be separated by a blank
line. Takes paragraphs that have hard breaks and puts all lines onto a single
line.
 
For Example:
 
INPUT
 
This is a multi line paragraph. It comprises
a few lines but has hard
breaks.
 
Now for the second
broken apart paragraph.
 
OUTPUT
 
This is a multi line paragraph. It comprises a few lines but has hard breaks.
 
Now for the second broken apart paragraph.
*/
 
#include <QFile>
#include <QRegExp>
#include <QString>
#include <QTextStream>
 
int main(int argc, char** argv)
{
    // Stream to write errors to the console.
    QTextStream errStream(stderr);
 
    // Regular expression to search for broken paragraphs. Works by looking
    // for char newline char. A proper ebook should have paragraphs separated
    // by a blank line meaning char newline newline char.
    QRegExp re("[^\n]\n[^\n]");
 
    // Store for the contents of the ebook.
    QString content;
 
    // We need an ebook file to work on.
    if (argc != 2) {
        errStream << QObject::tr("Error: No input file") << endl;
        return 1;
    }
 
    QFile ebook(argv[1]);
    if (!ebook.open(QIODevice::ReadWrite | QIODevice::Text)) {
        errStream << QObject::tr("Error: Could not open") << endl;
        return 1;
    }
 
    // We use a QTextStream to actually work on the file.
    QTextStream ioStream(&ebook);
    // Read the entire file contents into memory.
    content = ioStream.readAll();
 
    while (content.contains(re)) {
        // Remove the newline when there is a match with the regular expression.
        content = content.replace(content.indexOf(re)+1, 1, " ");
    }
 
    // Truncate the ebook so we don't end up with the original contents after
    // our modified contents.
    if (!ebook.resize(0)) {
        errStream << QObject::tr("Error: Could not truncate file") << endl;
        return 1;
    }
 
    // Store the modified content back on disk.
    ioStream.seek(0);
    ioStream << content;
 
    ebook.close();
 
    return 0;
}
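
The loop in the listing above searches from the start of the text for every match, which may account for a good part of the run time on a large book. As a possible direction for the optimization, here is a minimal single-pass sketch using only the standard C++ library rather than Qt. reflowParagraphs is a hypothetical helper name, and the sketch assumes Unix \n line endings and that the text is already in a std::string.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: join hard-wrapped lines within each paragraph.
// A newline becomes a space only when the characters on both sides of it
// are not newlines, mirroring the [^\n]\n[^\n] pattern used above.
std::string reflowParagraphs(const std::string &text)
{
    std::string out;
    out.reserve(text.size());
    for (std::size_t i = 0; i < text.size(); ++i) {
        const bool isBreak = text[i] == '\n';
        const bool hasPrev = i > 0 && text[i - 1] != '\n';
        const bool hasNext = i + 1 < text.size() && text[i + 1] != '\n';
        out += (isBreak && hasPrev && hasNext) ? ' ' : text[i];
    }
    return out;
}
```

Each character is visited once, so the work grows linearly with the size of the book. Blank lines between paragraphs and the newline at the end of the file are left alone because one of their neighbours is a newline or missing.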

remove_extra_whitespace_ebook_txt.cpp

/*
Copyright (c) 2008 John Schember 
 
Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:
 
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
 
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/
 
/*
Removes extraneous whitespace in a txt file ebook. This will remove leading
and trailing whitespace from every line and replace each run of ' ', '\t',
'\v', '\f', or '\r' within a line with a single ' '.
 
For Example:
 
INPUT
 
      This     is a bad                          line.
 
Now for  the     second     broken line.
 
OUTPUT
 
This is a bad line.
 
Now for the second broken line.
 
*/
 
#include <QFile>
#include <QString>
#include <QTextStream>
 
int main(int argc, char **argv)
{
    // Stream to write errors to the console.
    QTextStream errStream(stderr);
 
    // Store for the contents of the ebook.
    QString content;
 
    // We need an ebook file to work on.
    if (argc != 2) {
        errStream << QObject::tr("Error: No input file") << endl;
        return 1;
    }
 
    QFile ebook(argv[1]);
    if (!ebook.open(QIODevice::ReadWrite | QIODevice::Text)) {
        errStream << QObject::tr("Error: Could not open") << endl;
        return 1;
    }
 
    // We use a QTextStream to actually work on the file.
    QTextStream ioStream(&ebook);
 
    // Read every line and remove the extras we don't want.
    while (!ioStream.atEnd()) {
        content += ioStream.readLine().simplified() + "\n";
    }
 
    // Truncate the ebook so we don't end up with the original contents after
    // our modified contents.
    if (!ebook.resize(0)) {
        errStream << QObject::tr("Error: Could not truncate file") << endl;
        return 1;
    }
 
    // Store the modified content back on disk.
    ioStream.seek(0);
    ioStream << content;
 
    ebook.close();
 
    return 0;
}
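
To show what the per-line cleanup amounts to, here is a small sketch of the same behaviour using only the standard C++ library instead of QString::simplified(). simplifyLine and removeExtraWhitespace are hypothetical helper names, and the sketch assumes the text is already in a std::string with Unix \n line endings.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: trim one line and collapse each run of whitespace
// (' ', '\t', '\v', '\f', '\r') into a single space.
std::string simplifyLine(const std::string &line)
{
    std::istringstream words(line);
    std::string word;
    std::string out;
    // operator>> skips any amount of whitespace between words.
    while (words >> word) {
        if (!out.empty()) {
            out += ' ';
        }
        out += word;
    }
    return out;
}

// Hypothetical helper: apply simplifyLine to every line, ending each with \n.
std::string removeExtraWhitespace(const std::string &text)
{
    std::istringstream lines(text);
    std::string line;
    std::string out;
    while (std::getline(lines, line)) {
        out += simplifyLine(line);
        out += '\n';
    }
    return out;
}
```

As in the listing above, every output line ends with \n, so a file whose last line had no newline gains one.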
