Archive for the ‘programming’ Category

* Calibre Week in Review

Posted on July 2nd, 2011 by John. Filed under calibre.


This week saw some more work on Get Books. One change is not user visible but makes it easier to add new stores. The other change is user visible and was suggested by a user.

I’ve added a base class for OPDS OpenSearch based stores. It uses the OpenSearch URL to retrieve all results, then puts them into SearchResult objects for use with Get Books. Individual stores will still need to do some massaging of the results because I’ve found that they are often inconsistent from store to store. However, this class does the bulk of the work as far as searching and retrieving results goes.
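As a rough illustration of the idea (not the code that ships in calibre), the sketch below turns the Atom entries of an OPDS search response into result objects. The SearchResult fields, the function name, and the sample feed are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

ATOM = '{http://www.w3.org/2005/Atom}'


@dataclass
class SearchResult:
    # Hypothetical stand-in for the result object used by Get Books.
    title: str = ''
    author: str = ''
    detail_item: str = ''


def parse_opds_results(feed_xml):
    """Turn the Atom entries of an OPDS search response into SearchResult objects."""
    root = ET.fromstring(feed_xml)
    results = []
    for entry in root.iter(ATOM + 'entry'):
        result = SearchResult()
        result.title = (entry.findtext(ATOM + 'title') or '').strip()
        # An entry may list several authors; join whatever names are present.
        names = [(a.findtext(ATOM + 'name') or '').strip()
                 for a in entry.findall(ATOM + 'author')]
        result.author = ' & '.join(n for n in names if n)
        result.detail_item = (entry.findtext(ATOM + 'id') or '').strip()
        results.append(result)
    return results


# Made-up feed used only to exercise the parser.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Example Book</title>
    <author><name>Jane Doe</name></author>
    <id>urn:example:1</id>
  </entry>
</feed>"""

if __name__ == '__main__':
    for book in parse_opds_results(SAMPLE_FEED):
        print(book)
```

A store-specific subclass would then only have to clean up whatever its own feed gets wrong, such as missing authors or odd title formatting.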

The other change is the ability to download results directly from within the Get Books search result dialog. Stores with OPDS access often include direct download links as part of an entry. Stores can now collect these links and pass them along to the search dialog. When links are present, the search dialog shows a green down arrow signifying that the item can be downloaded. Clicking the item shows a dialog asking which format (usually there are many) to download. If you don’t want to download, you can right click and choose the option to go to the result in the store itself. This change makes it even easier for people to get books into calibre. Only a few stores support this due to the nature of the stores themselves, but I hope to see more in the future.
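For illustration, here is one way a store could collect direct download links from an OPDS entry. It relies on the acquisition link relations defined by the OPDS specification; the MIME-to-format table, the function name, and the sample entry are assumptions for this sketch rather than calibre’s actual code.

```python
import xml.etree.ElementTree as ET

ATOM = '{http://www.w3.org/2005/Atom}'

# Link relations the OPDS spec uses for files that can be fetched directly.
DIRECT_RELS = {
    'http://opds-spec.org/acquisition',
    'http://opds-spec.org/acquisition/open-access',
}

# Assumed mapping from MIME type to a format label; extend as needed.
MIME_TO_FORMAT = {
    'application/epub+zip': 'EPUB',
    'application/x-mobipocket-ebook': 'MOBI',
    'application/pdf': 'PDF',
}


def collect_downloads(entry):
    """Return {format: url} for the direct download links of one Atom entry."""
    downloads = {}
    for link in entry.findall(ATOM + 'link'):
        fmt = MIME_TO_FORMAT.get(link.get('type', ''))
        href = link.get('href')
        if link.get('rel') in DIRECT_RELS and fmt and href:
            downloads[fmt] = href
    return downloads


# Made-up entry: one direct download, one purchase link, one web page link.
SAMPLE_ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Example Book</title>
  <link rel="http://opds-spec.org/acquisition" type="application/epub+zip"
        href="http://store.tld/book.epub"/>
  <link rel="http://opds-spec.org/acquisition/buy" type="application/pdf"
        href="http://store.tld/buy/book"/>
  <link rel="alternate" type="text/html" href="http://store.tld/book"/>
</entry>"""

if __name__ == '__main__':
    print(collect_downloads(ET.fromstring(SAMPLE_ENTRY)))
```

An empty dictionary would mean the result has no direct download, so the dialog would skip the green arrow and fall back to opening the store page.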



* Calibre “Get book” Ideas

Posted on June 1st, 2011 by John. Filed under calibre.


This is a list of ideas I’ve had or suggestions I’ve received from people about what to do with “Get books” in calibre to make it better. These are all ideas and may or may not make it into a future release.

A set of OPDS classes that makes it very easy and quick to add new stores that support OPDS. Basically, the plugin author would only need to specify the base feed URL and the class would take care of finding the search feed. It would then return search results via the feed’s search. So all a plugin author would need to do is something like:

class NewStore(OPDSStore):
 
    FEED = 'http://store.tld/opds/feed.xml'
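
To make “finding the search feed” concrete, here is a minimal sketch of how such a base class might discover the search URL. It assumes the catalog links to an OpenSearch description document, which is how OPDS catalogs commonly advertise search. All function names and the sample documents are hypothetical, and no network access is performed.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote, urljoin

ATOM = '{http://www.w3.org/2005/Atom}'
OPENSEARCH = '{http://a9.com/-/spec/opensearch/1.1/}'
OSD_TYPE = 'application/opensearchdescription+xml'


def find_search_description(feed_xml, feed_url):
    """Return the absolute URL of the OpenSearch description, or None."""
    root = ET.fromstring(feed_xml)
    for link in root.findall(ATOM + 'link'):
        if link.get('rel') == 'search' and link.get('type') == OSD_TYPE:
            return urljoin(feed_url, link.get('href', ''))
    return None


def find_search_template(description_xml):
    """Return the Atom search URL template from an OpenSearch description."""
    root = ET.fromstring(description_xml)
    for url in root.findall(OPENSEARCH + 'Url'):
        if url.get('type', '').startswith('application/atom+xml'):
            return url.get('template')
    return None


def build_search_url(template, query):
    """Fill the {searchTerms} placeholder with the URL-quoted query."""
    return template.replace('{searchTerms}', quote(query))


# Made-up documents standing in for what a store would serve.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <link rel="search" type="application/opensearchdescription+xml"
        href="/opds/search.xml"/>
</feed>"""

SAMPLE_DESCRIPTION = """<OpenSearchDescription
    xmlns="http://a9.com/-/spec/opensearch/1.1/">
  <Url type="application/atom+xml"
       template="http://store.tld/opds/search?q={searchTerms}"/>
</OpenSearchDescription>"""

if __name__ == '__main__':
    print(find_search_description(SAMPLE_FEED, 'http://store.tld/opds/feed.xml'))
    print(build_search_url(find_search_template(SAMPLE_DESCRIPTION), 'moby dick'))
```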

Going further with this idea, users could be allowed to specify OPDS feeds in a list, possibly with some meta information about each store. Each feed would then be treated just like a store plugin. This would require some work to accomplish because the current framework for “Get books” assumes each store is a separate plugin.

Many free stores (especially ones with OPDS feeds) allow direct downloads of the ebook. It would be very nice if double clicking a search result downloaded the book right then and put it into the user’s calibre library, instead of going to the store’s web page for the book. For free sources this would remove the separate steps of downloading and then adding, and it would reduce the need to use the internal browser. It would need some thought on how to signify whether a search result downloads a book or goes to a more information page. It would also need to be configurable, and there would need to be a way to always access the more info page. Further, many stores provide more than one format, so there would need to be some way for users to select or specify which format they want to download.

Again these are just some ideas of where I could take “Get books”.




* KDocker Ubuntu PPA Updated

Posted on May 30th, 2011 by John. Filed under KDocker.


I’ve updated the Ubuntu PPA for KDocker. It now includes the 4.6 release for Lucid, Maverick and Natty. Amd64 and i386 architectures have packages ready to go.



* KDocker 4.6 Released

Posted on May 29th, 2011 by John. Filed under KDocker.


Version 4.6 is now available for direct download.

This release focuses on fixing two bugs when using KDocker with the KWin window manager. It fixes an issue with windows not restoring when the “iconify when minimized” option is set. There is also a small change in how windows are focused after being restored, as KWin often didn’t focus restored windows.



* Calibre Weeks in Review

Posted on May 28th, 2011 by John. Filed under calibre.


Get books has been out for a bit now and it’s a huge success. It’s gone over better than I could have hoped. My focus for the past few weeks has still been on Get books, mainly adding more stores and making it easier for users. Search settings and a store chooser that allows users to disable stores they don’t care about were added in the 0.8.3 release. Adding more polish to Get books, particularly the chooser, is my focus for this week.



* Calibre Weeks in Review

Posted on April 24th, 2011 by John. Filed under calibre.


Once again this is a weeks in review instead of a week in review. I’ve been focusing more on new features than blogging. Calibre 0.7.57 is out and is also the beta for 0.8. Adding the tweak “test_eight_code = True” will enable the 0.8 features.

Get Books aka Stores

For quite some time I’ve been working on integrating support for searching and connecting to third party stores to make it easier for users to find and acquire books they’re interested in. This is a very large feature and one I’m very excited about. There are two pieces: The individual store plugins that connect the user to a given store and a meta-store search that searches all of the store plugins at once.

For 0.8 I have support for 14 stores. They are a mix of big name, independent, paid, free, and public domain. There is something for everyone. The majority of the stores are implemented through an embedded web browser. This is because the majority of stores are only accessible via their web site. MobileRead is the one exception but I’ll talk about that later.

By default the stores are accessed through the embedded web browser, but each store can be configured to open in the system web browser instead. One major benefit of the embedded approach is that I’m able to detect ebook downloads. When an ebook is downloaded it is automatically added to the currently open library.

MobileRead is the exception and opens in its own search window. Right now it opens to the specific book’s entry in the embedded web browser so you can see details and download the book.

The meta-store search (along with MobileRead’s search dialog) allows for full boolean and field logic, just like the main calibre window. The search gets results from every store and shows them in one easy to sort list. Title, Author, Price, DRM status, Store, and Formats are all listed.

PDB – Plucker Input

Not much to say about this but it’s been a long time coming. Plucker is now supported as an input format. Not all features are supported (tables for instance). However, Plucker files from pretty much every source will have the main content come through.



* Calibre Weeks in Review

Posted on April 9th, 2011 by John. Filed under calibre.


It has been a few weeks since I’ve done a calibre week in review. This is partly because I had been working on some new features for the upcoming 0.8 release. I haven’t wanted to talk about it very much until the release gets closer. Kovid said yesterday that he will be reviewing my changes next week.

HTMLZ

One complaint I hear often is in regard to the inability to edit ebooks. Many people seem to think EPUB is not a good format for editing. Sigil is often the solution given around these parts but some people insist on the need for a book to be contained in a single HTML file. Simply unzipping an EPUB doesn’t accomplish this due to the need to split the files.

To remedy this situation I’ve added a new output format: HTMLZ. Just like TXTZ it is just a zip file with a different extension to differentiate it. Inside is a metadata.opf file (calibre can read and write metadata to it). Images are preserved, renamed and placed in an images folder. This format is available in the 0.7.54 release.

Also inside is a single HTML file. Even if you’re converting from an EPUB that has been split into multiple parts, a conversion to HTMLZ will result in a single HTML file. To go along with this there are a number of ways to configure CSS handling. The default is to place the CSS in a separate style.css file. It can also place class based CSS inside the head element of the HTML itself. Or you can have it write the CSS inline within each element. The last option is to remove the CSS and convert as much as possible (a very limited set right now) to HTML tags.
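
The layout described above can be sketched with Python’s zipfile module. This is only an illustration of the container, not the converter itself. The file name index.html is an assumption (the post only says there is a single HTML file), and build_htmlz is a hypothetical helper.

```python
import io
import zipfile


def build_htmlz(html, opf, css='', images=None):
    """Pack one HTML file, its metadata, CSS and images into an in-memory zip."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w', zipfile.ZIP_DEFLATED) as zf:
        zf.writestr('index.html', html)    # assumed name for the single HTML file
        zf.writestr('metadata.opf', opf)   # metadata calibre can read and write
        if css:
            zf.writestr('style.css', css)  # default: CSS kept in its own file
        for name, data in (images or {}).items():
            zf.writestr('images/' + name, data)
    return buf.getvalue()


if __name__ == '__main__':
    data = build_htmlz('<html><body><p>Hi</p></body></html>', '<package/>',
                       css='p { margin: 0 }', images={'cover.jpg': b'\xff\xd8'})
    # Save with the .htmlz extension so it can be told apart from a plain zip.
    print(len(data), 'bytes')
```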

As with all of my output format attempts I believe this will have quite a few bugs. Let me know about any issues so I can fix them. I hope people find this useful for their hand editing needs.

FB2 Output

Just a small change to FB2 output this time. Users can now select the genre for the output document. The default is antique but a list of supported genres is available to choose from.

GUI – Toolbars

theducks on MobileRead made a few requests regarding handling of toolbars. He was having trouble with the number of interface action plugins he had added to the toolbar and needed more space.

The first change is removing the “split toolbar into two” option and making the second toolbar user configurable. This way you can add whatever you want, in the order you want, to the second toolbar.

Along with this, theducks also wanted to be able to remove the icons on the toolbars, so I added an off option to the toolbar icon size setting. This way icons can be removed completely. If they are disabled then the text will automatically be used even if the toolbar text option is set to never show. This way you won’t lose your toolbar.

I also made it so that any toolbar that doesn’t have any items on it will be hidden. All of these toolbar changes are in the 0.7.54 release.

GUI – Menubar

Another change to the GUI, which won’t be out until the 0.7.55 release, is the addition of a configurable menubar. I personally don’t like the toolbar, so I added support for a menubar. It is configurable in the toolbar configuration area in preferences. Just like the toolbars and right click menus, you can configure what is in the menu and what order the items appear in.

The main motivation of the menubar addition was the fact that I use a Mac. OS X always shows a menubar outside of the application window. Calibre never looked quite right on a Mac because it doesn’t have a menu so OS X’s menubar would always appear empty.

GUI – OS X

On OS X the menubar has a number of default items. On all other OSes the menubar is empty and hidden by default. Also some toolbar items are not shown by default on OS X because they are available through the menubar. The idea is to provide visually appealing defaults for OS X and a more intuitive experience for Mac users.

I’ve also made the toolbar and statusbar on OS X use the system type instead of the generic Qt toolbar and status bar. They look better and behave as one would expect on OS X. The hide toolbar button, for instance, now works and hides the toolbar.

Other

Aside from my changes, I’ve been giving direction to Perkin from MobileRead for enhancements to the Textile input and output. The input changes are already in the latest (0.7.54) release. He’s still working on enhancements to Textile output to ensure it produces the same output that the input supports. He has also identified a few bugs with the current Textile output and is working to fix them too.



* calibre APNX GUI Plugin

Posted on March 19th, 2011 by John. Filed under calibre.


The Amazon APNX file generation added to the Kindle device interface has been wildly popular. So popular that people want to use the APNX files without a Kindle. It turns out a large number of calibre users don’t actually read on a Kindle but on one of the many reading apps Amazon produces (PC, Mac, iPad…). So I’ve created a GUI plugin that allows users to create and save APNX files from MobiPocket (MOBI, AZW, and PRC) files. It can be found here.

Because this feature is highly niche (only users of Amazon reading apps will have a use for it) I decided not to make it a part of calibre proper. Instead it is being hosted as a 3rd party plugin on. The good news is the new Plugin Updater plugin will support my APNX plugin.



* Calibre Week in Review

Posted on February 18th, 2011 by John. Filed under calibre.


This is a short week for the week in review because I’m now doing my week from Friday to Thursday. Last week I ended my week on Monday so this review only has a few days’ worth of work.

TXTZ

I’ve added an import plugin that runs over TXT content when it is added to the library. What happens is the TXT file is scanned looking for Markdown (inline or reference) and Textile image references. It collects all of the images and adds them plus the TXT file to a TXTZ archive when the following conditions are true:

  • Path must not be empty.
  • Path must be a relative path.
  • The mimetype of the image (based on extension) must be an OEB supported image type. (JPG, PNG, SVG, GIF).
  • The image must exist relative to the TXT file’s location and the location specified by the path.

If no images are found referenced in the TXT file, or if the images found fail the above tests, then a TXTZ archive is not created and the TXT file itself is added to the library.
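
The four conditions above can be expressed as a small check. This is a sketch and not the plugin’s code: the function name is hypothetical, and it tests the file extension directly instead of resolving a mimetype first.

```python
import os

# Extensions for the OEB supported image types listed above (JPG, PNG, SVG, GIF).
SUPPORTED_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.svg', '.gif'}


def image_can_be_bundled(path, txt_dir):
    """Apply the four TXTZ conditions to one image reference found in a TXT file."""
    if not path:
        return False                      # path must not be empty
    if os.path.isabs(path):
        return False                      # path must be relative
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        return False                      # must be a supported image type
    # The image must exist relative to the TXT file's location.
    return os.path.isfile(os.path.join(txt_dir, path))


if __name__ == '__main__':
    print(image_can_be_bundled('images/cover.png', os.getcwd()))
```

If every reference in a file fails this check, the plain TXT file would be added to the library unchanged.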

PML Input:

Fixed a bug where TOC entries specified by \x and \X were not being included in the TOC.

Heuristics:

The italicize common cases patterns got tweaked again. One pattern (/text/) would match <br /> </… and cause issues.
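
To see how a bare /text/ pattern can trip over markup, the sketch below compares a naive pattern with a guarded one. Both patterns are made up for illustration and are not the expressions calibre actually uses.

```python
import re

# Both patterns are illustrative assumptions, not calibre's heuristics.
# Naive: anything between two slashes.
NAIVE = re.compile(r'/(.+?)/')
# Guarded: the opening slash may not follow a word character or '<', the text
# must start with a word character and contain no slashes or angle brackets,
# and the closing slash may not be followed by a word character or '>'.
GUARDED = re.compile(r'(?<![\w<])/(\w[^/<>\n]*?)/(?![\w>])')

MARKUP = '<br /> </p>'
PROSE = 'this is /important/ text'


def italicize(pattern, text):
    """Wrap every match of the pattern in <i> tags."""
    return pattern.sub(r'<i>\1</i>', text)


if __name__ == '__main__':
    print(italicize(NAIVE, MARKUP))    # the markup gets mangled
    print(italicize(GUARDED, MARKUP))  # the markup is left alone
    print(italicize(GUARDED, PROSE))
```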



* Calibre Week in Review

Posted on February 13th, 2011 by John. Filed under calibre.


I’ve been putting up my week in reviews based on a week starting on Monday for some time now. I’ve been thinking about this and it doesn’t really make much sense. Calibre has a release pretty much every Friday now. So starting next week I’m going to change my week in review to be Friday through Thursday. This way features I talk about in my review will be in the just released version.

TXT Input

First the small changes. Heuristic processing now enables smarten punctuation to further my goal of TXT documents coming out looking great. Hard scene breaks are now separated from the text to ensure they don’t accidentally get merged into the paragraph before or after. The formatting type none was renamed to plain to correspond with the formatting output option.

The only big change for TXT input was the addition of a new paragraph type option. It’s called off. When specified, no modifications to the paragraph structure are applied to the text. This is especially useful for Markdown and Textile formatted documents. It ensures there are no changes that will cause elements to render incorrectly.

TXTZ Input

A bug caused images to not be included when converting. With Kovid’s help this has been corrected.

TXT Output

I modified Textile output to not write %’s for span tags. The span tag is superfluous in calibre’s Textile output because it does not contain any real information. The span tags are invisible when rendering the XHTML. The %’s cluttered up the resultant TXT so they were removed.

PML Input

PML input saw a lot of work relating to \t and \T tags. The entire handling of these tags was rewritten. Unfortunately, there is no way to have these two tags map one to one to XHTML so only some common cases are handled.

  • \T’s that do not start the line are ignored.
  • \t’s that start and end the line use a margin for the text block.
  • \t’s that start a line and end another line use a margin for the text block.
  • \t’s that start a line but end before a line ending will use a text-indent.
  • \t’s that are in the middle of lines are ignored. Open and closed \t blocks within a line are ignored.
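
As a rough sketch of the \t rules above, the function below classifies a single PML line. It is a simplification written for this post, not the parser itself: it looks at one line at a time and assumes an unclosed \t at the start of a line closes at the end of a later line. \T handling is left out.

```python
TAG = '\\t'  # the two-character PML tag: a backslash followed by "t"


def classify_t_line(line):
    """Return how a \\t block on this line would be rendered."""
    if not line.startswith(TAG):
        return 'ignored'       # \t in the middle of a line
    close = line.find(TAG, len(TAG))
    if close == -1:
        # Assumption: the block closes at the end of a later line.
        return 'margin'
    if close + len(TAG) == len(line):
        return 'margin'        # starts and ends the same line
    return 'text-indent'       # starts the line but ends before the line ending


if __name__ == '__main__':
    for sample in (r'\tindented block\t',
                   r'\tFirst words\t and the rest',
                   r'some \ttext\t here'):
        print(repr(sample), '->', classify_t_line(sample))
```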

Heuristics

Once again the italicize common cases regex was tweaked. This time it was to fix an issue with None being inserted in the text before adjacent underscores. I’m hoping this is the last time for a while that I need to tweak them.

Kindle Interface

The work I did on the APNX format was undertaken for a very real world reason: integrating APNX generation into calibre’s Kindle device interface plugin.

The 0.7.45 release saw the initial inclusion of this feature. After receiving some user feedback I’ve tweaked it for the 0.7.46 release. The 0.7.45 release included a very basic APNX file that would create pages every 1024 bytes of uncompressed HTML.

In 0.7.46 there are a lot of differences. Writing the APNX can be disabled. This is very useful for Kindle 2 users as the Kindle interface works for both the Kindle 2 and the Kindle 3.

There are now two parsers for generating pages. The default is the fast parser. It uses the uncompressed length of the MOBI HTML and creates pages every 2300 bytes. A few users complained that 1024 created too many pages, about double what you would find in an average paperback book. The 2300 number is a bit more than double 1024, and I chose it after counting the number of characters in a page of an average paperback book. I counted approximately 2240 and added an additional 60 characters to account for markup per page. Thus 2300.
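
The fast approach boils down to stepping through the uncompressed length in fixed increments. Here is a minimal sketch of that arithmetic; the function name is hypothetical, and starting the first page at offset 0 is an assumption.

```python
TEXT_PER_PAGE = 2240     # characters counted on an average paperback page
MARKUP_PER_PAGE = 60     # allowance for markup
BYTES_PER_PAGE = TEXT_PER_PAGE + MARKUP_PER_PAGE  # 2300


def fast_page_offsets(uncompressed_length, bytes_per_page=BYTES_PER_PAGE):
    """Return the byte offsets at which each page starts."""
    return list(range(0, uncompressed_length, bytes_per_page))


if __name__ == '__main__':
    length = 500000  # made-up uncompressed length of a book
    print(len(fast_page_offsets(length)), 'pages at 2300 bytes per page')
    print(len(fast_page_offsets(length, 1024)), 'pages at 1024 bytes per page')
```

Because only the length is needed, nothing has to be decompressed, which is why this approach is quick.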

The other parser, which can be enabled in the Kindle interface’s settings, is the accurate parser. It works by decompressing the MOBI HTML and looking at the actual content. The big difference, and why I’m calling it an accurate parser, is that it looks at the amount of visible text to decide when a page ends and a new one begins. The assumption is there are 30 lines per page and each line can have up to 70 characters. The parser starts a new line every time it encounters a new paragraph and every 70 characters in a paragraph.
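
The line and page counting can be sketched as follows. This simplified version works on a list of already extracted paragraph strings and only counts pages, whereas a real parser has to record positions in the MOBI HTML. The function names are hypothetical, and treating an empty paragraph as one line is an assumption.

```python
import math

LINES_PER_PAGE = 30
CHARS_PER_LINE = 70


def count_lines(paragraphs):
    """Each paragraph starts a new line and wraps every 70 characters."""
    total = 0
    for text in paragraphs:
        # Assumption: an empty paragraph still takes up one line.
        total += max(1, math.ceil(len(text) / CHARS_PER_LINE))
    return total


def count_pages(paragraphs):
    """A new page begins after every 30 lines of visible text."""
    return max(1, math.ceil(count_lines(paragraphs) / LINES_PER_PAGE))


if __name__ == '__main__':
    book = ['Short line.', 'x' * 150, ''] * 40  # made-up visible text
    print(count_lines(book), 'lines,', count_pages(book), 'pages')
```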

The major disadvantage of the accurate parser, and why it’s not the default, is that it’s slow. It requires the text to be decompressed and parsed. With a PalmDoc compressed file this can take a few seconds but with a HUFF/CDIC compressed file it can take minutes.

The other minor disadvantage of the accurate parser is that it cannot work on DRM content. The fast parser can because the uncompressed text length is stored unencrypted in the MOBI header. If the accurate parser is chosen it will fall back to the fast parser for DRM content. So whenever a Mobipocket book is sent to the Kindle (AZW, MOBI, PRC) an APNX file can and will (unless disabled) be generated.
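
The fallback rule is simple enough to state in a few lines. The names below are hypothetical and only illustrate the decision, not how the device plugin is structured.

```python
def choose_parser(requested, has_drm, apnx_enabled=True):
    """Pick the page parser for a book, or None when APNX writing is disabled."""
    if not apnx_enabled:
        return None
    if requested == 'accurate' and has_drm:
        # The text cannot be decompressed, so use the length stored in the header.
        return 'fast'
    return requested


if __name__ == '__main__':
    print(choose_parser('accurate', has_drm=True))
    print(choose_parser('accurate', has_drm=False))
```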

One thing I will note about the accurate parser is that it currently ignores all markup and only looks at text. That means it can be made even more accurate by accounting for <div class="mbp_pagebreak" />, <br>, <hr>, images, margins, and font size changes. I do plan to add support for most if not all of these in the future. However, most books people read on their Kindle are pretty much all text, and the accurate parser does a good enough job giving page numbers that correspond to the page length in a paperback book, so I don’t see a pressing need to spend the time on it at this moment.
