Archive for the ‘programming’ Category
* How Find Searches in Sigil 0.5.0
Posted on January 29th, 2012 by John. Filed under Sigil.
There has been some confusion about how find works in the now released 0.5.0. The confusion stems from the 0.4.90x betas: one method was used in the early betas and it was changed later on. This all stems from the regular expression engine being changed from QRegExp to PCRE. The issue at hand is how and when the cursor is taken into account when running a find. In this regard 0.5.0 works no differently than 0.4.2.
When doing a count the cursor is ignored. The entire document is taken into account from start to end.
When doing a find next the find starts from the cursor location. Everything before the cursor is ignored and not taken into account. This can, in some cases when using a regular expression, lead to the number of matches being different from the total returned by count. Again, this can only happen when using a regular expression. The reason is that the text matched by a regular expression can itself contain shorter pieces of text that also match the expression. For example:
Expression:
<div>.+</div>
Text:
<div>blah <div> blah </div> blah </div>
The expression will match the text from beginning to end. If you put the cursor to the right of the first < then the match will start from the second div and go to the end. This is because regular expressions can match a variable amount of text, unlike a fixed expression like “abc”, which will always match “abc”.
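Here is a minimal sketch of the difference between count and find next. It uses std::regex as a stand-in for PCRE, and the names Match, count_matches and find_next are made up for illustration. This is not Sigil's code, just the behavior described above under those assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Hypothetical result type: where the match starts in the document and its text.
struct Match {
    bool found;
    std::size_t start;
    std::string text;
};

// Count ignores the cursor: scan the whole document for non-overlapping matches.
std::size_t count_matches(const std::string& doc, const std::regex& re) {
    std::sregex_iterator first(doc.begin(), doc.end(), re);
    return static_cast<std::size_t>(std::distance(first, std::sregex_iterator()));
}

// Find next only looks at the text from the cursor offset onward.
Match find_next(const std::string& doc, const std::regex& re, std::size_t cursor) {
    assert(cursor <= doc.size());
    std::smatch m;
    if (!std::regex_search(doc.begin() + static_cast<std::ptrdiff_t>(cursor), doc.end(), m, re)) {
        return Match{false, 0, ""};
    }
    // m.position(0) is relative to where the search started, so add the cursor back.
    return Match{true, cursor + static_cast<std::size_t>(m.position(0)), m.str(0)};
}
```

With the expression and text from the example, count_matches reports a single match covering the whole text. find_next with the cursor at offset 0 returns that same match, while find_next with the cursor at offset 1 returns a shorter match that starts at the second div.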
Finding backwards will match from the start of the document up until the cursor position. This is done by finding all matches from the start to the cursor then using the last match. Again, in the case of regular expressions, a backward find can match different text than a forward find.
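A backward find along those lines can be sketched as follows. Again std::regex stands in for PCRE, and Match and find_previous are hypothetical names used only for this illustration, not Sigil's actual functions.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical result type: where the match starts in the document and its text.
struct Match {
    bool found;
    std::size_t start;
    std::string text;
};

// Backward find: walk every match in the text before the cursor and keep the last one.
Match find_previous(const std::string& doc, const std::regex& re, std::size_t cursor) {
    assert(cursor <= doc.size());
    Match last{false, 0, ""};
    // Only the text from the start of the document up to the cursor is searched.
    std::string::const_iterator stop = doc.begin() + static_cast<std::ptrdiff_t>(cursor);
    for (std::sregex_iterator it(doc.begin(), stop, re), done; it != done; ++it) {
        last = Match{true, static_cast<std::size_t>(it->position(0)), it->str(0)};
    }
    return last;
}
```

Using the earlier div example with the cursor placed just after the first closing div tag (offset 27), this returns the text from the first opening div up to that closing tag. A forward find from the same document can return different text, which is the mismatch described above.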
Find forward and find backward both search from the cursor, so its position in the document is taken into account. In the majority of instances find backward, find forward and count will all match the exact same text. However, it is possible, due to their nature, to construct a regular expression that matches different segments of text within a larger segment depending on where the cursor is located.
The above also applies to replace as a find is run to find the text to replace.
* Consolidation of Sigil Help Forums
Posted on January 27th, 2012 by John. Filed under Sigil.
For some time now Sigil has had two different help forums. One at MobileRead and the other as a Google Group. This has caused quite a bit of confusion because people don’t know the best place to go for help.
I’ve decided to close (it’s already done so don’t ask me to reconsider) the Google Group. This was an easy decision because 1) MobileRead’s sub forum gets more traffic, 2) I use MobileRead and I don’t use the Google Group, 3) most posts on the Google Group were unanswered, making it a poor place to go for help.
* Sigil 0.5.0 Released
Posted on January 21st, 2012 by John. Filed under Sigil.
I’m happy to announce the release of Sigil version 0.5.0.
0.5.0 comes with a number of bug fixes and some major new features:
- Inline spell check in code view
- Support for PCRE in search and replace
- Translations into 15 languages
Please see the changelog for a full list of changes in this release.
One smaller change is I’ve decided to drop OpenCandy. Surprisingly I’ve only encountered one complaint about OpenCandy and it was directed at bundling offers for other software inside of an installer not at OpenCandy in particular. I want to make it clear that this change is not due to user request or opposition to OpenCandy but my own decision.
While I respect what OpenCandy the company is doing and I don’t see anything wrong with the offerings they provide I don’t think their system is right for Sigil. The big thing I don’t like is OpenCandy’s installer components are distributed as closed source binary modules. While my understanding is this can be used without running afoul of the GPL I fully believe it goes against the spirit of the GPL and open source in general.
Thank you to everyone who provided feedback and helped during the beta process.
* 0.4.902 (0.5 beta) Available
Posted on December 12th, 2011 by John. Filed under Sigil.
The first beta for 0.5 (0.4.902) is now available.
There are a few new features I’m most interested in getting feedback on: inline spell check, translations, and the new PCRE engine. Of course crashes and major issues will be looked into and hopefully fixed before the final release.
* Sigil and Data Loss Bugs
Posted on November 8th, 2011 by John. Filed under Sigil.
The majority of the data loss issues have been mitigated at this point. With a workflow of open, save as after major changes and saving after minor ones, catastrophic data loss can be worked around to the point that Sigil can be, and is being, used on a day to day basis.
That said, there are issues with data loss in Sigil and they are a priority. I’m currently finishing up the 0.5 release (I do not have a set release date at this point) which is mainly a feature release and only addresses some of the data loss issues. For example, you can still have everything in an entire XHTML document removed by putting a malformed XML header in the document.
The issue has three components that require major work to fix. I hope to have it all completed for the 0.6 release but it’s going to be some time before it’s ready.
The issues are:
1) Sigil currently uses Tidy to clean all XHTML to ensure it conforms (as much as it can) to the XHTML spec. I have seen Tidy remove tags it thinks are empty when they influence how the document is rendered. I want to keep Tidy as part of Sigil but I believe it should only be run when the user asks for it and any changes it makes the user should be able to revert.
2) An intermediate data store is used that requires valid XML. This store shuffles data between the book and code view. Because this store requires valid XML (valid XHTML conforms) there is the potential for data loss if it has to auto correct the XHTML. If you are in code view, have malformed structural issues with the XHTML and move out of it, there is a warning dialog. This only appears when you are working on one file at a time. If you are replacing across multiple files auto correction is used and this can lead to data loss. This data store needs to be replaced with one that does not require valid XML.
3) Putting malformed content into the book view will cause the book view to try to correct it. Again auto correction can lead to data loss. This is mitigated by the malformed error dialog but many users just disable it and find that sections of their document are missing after looking at it in book view. Also, the book view is a WYSIWYG tool so it does make structural changes to the document and these may or may not be what the user expects. As with Tidy, changes made by the book view need to be revertible. I am thinking about ways to make it more obvious that the book view makes changes to the document. This way the user is aware that they need to use undo (doesn’t currently work for book view changes) to revert the changes if they don’t like them. I’m thinking about using a preview mode by default that doesn’t make any changes and a separate edit mode to make this distinction obvious.
The above issues can be fixed but they are not quick or easy changes. I plan on making them for the 0.6 release as part of the changes necessary to support EPUB 3. However, there is the possibility that they will slip to 0.7 due to how large they are. Unfortunately, all I can say right now is I’m aware of the issue, I know what the cause is, and I have an idea of how to correct it but it’s not going to happen tomorrow.
* Retrieve Formatting Set by QSyntaxHighlighter
Posted on October 29th, 2011 by John. Filed under programming.
I have been working on adding inline spell check to Sigil recently and ran into a quirk in Qt that isn’t immediately obvious. I ended up having to look through the Qt source code to understand exactly what was happening.
When dealing with a QPlainTextEdit you can get the QTextCursor and use the charFormat() function to retrieve the QTextCharFormat for the character before the cursor. This does not work when the formatting is set by a QSyntaxHighlighter!
charFormat retrieves the character format that has explicitly been set on the QPlainTextEdit. QSyntaxHighlighter does not directly set the formatting on the QPlainTextEdit. Instead QSyntaxHighlighter sets the format in additionalFormats as part of the block layout. All formatting for the block the cursor is currently in can be accessed by using QPlainTextEdit::textCursor().block().layout()->additionalFormats().
QTextLayout::additionalFormats() returns a list of FormatRange objects. A FormatRange gives the start of the formatting (relative to the block not the full text in the QPlainTextEdit), the length and the formatting (as set by the QSyntaxHighlighter). Simply loop over all of the FormatRange objects and check if the cursor is within a range to determine what formatting is applied to a particular part of the block’s text. Use QTextCursor::positionInBlock() to determine the relative position of the cursor within the block.
Here is an example from Sigil that I use for spell checking. It determines if a particular segment of text has the misspelled word style applied to it. It then selects the text.
    QTextCursor c = textCursor();
    int pos = c.positionInBlock();
    // Walk the formats the highlighter attached to the current block's layout.
    foreach (QTextLayout::FormatRange r, c.block().layout()->additionalFormats()) {
        // Is the cursor inside a range underlined as a misspelled word?
        if (pos >= r.start && pos <= r.start + r.length
                && r.format.underlineStyle() == QTextCharFormat::SpellCheckUnderline) {
            // r.start is relative to the block, so add the block's position.
            c.setPosition(c.block().position() + r.start);
            c.movePosition(QTextCursor::Right, QTextCursor::KeepAnchor, r.length);
            setTextCursor(c);
            break;
        }
    }
*Note: QTextEdit can be substituted any place QPlainTextEdit is used above. Everything here applies to both classes, not just QPlainTextEdit.
* Sigil Now Supports Translations
Posted on October 8th, 2011 by John. Filed under Sigil.
One of the new features that has been implemented for 0.5 (release date yet to be determined) is support for translations. For Sigil’s first supported language Grzegorz Wolszczak has provided a Polish translation. Currently translations are loaded based upon the current system locale. There is no support for choosing the language via preferences. This may come at a later time but for now I believe that using the system locale will handle the majority of user needs.
I’ve put together a wiki page with instructions for creating translations. This first revision is a bit basic but as people have questions I plan to update it to make it more robust.
* Sigil Keyboard Shortcuts
Posted on October 1st, 2011 by John. Filed under Sigil.
Thanks to Grzegorz Wolszczak, Sigil now (this will be part of the 0.5 release) allows users to change keyboard shortcuts for many actions. Grzegorz has been helping out a lot: he helped to introduce a preferences dialog and provided the user configurable keyboard shortcuts.
* Week in Review
Posted on September 16th, 2011 by John. Filed under calibre, Sigil.
Calibre
This week I focused on PDF output. There was a bug introduced in 0.8.17 that broke PDF output which has now been fixed. I was also able to fix PDF output on OS X. The PDF output engine on OS X is now using OS X’s internal PDF engine instead of Qt’s. Page sizes other than A4 are now possible and the PDFs produced are no longer large image based monstrosities. Meaning, text is now selectable and can be copied.
Sigil
I am currently working on Perl Compatible Regular Expressions (PCRE) support. An initial version has been put into git. I have an enhanced version working that allows for case changes in the replacement text. Right now I’m working on caching the results of a search to improve performance.
* Calibre Week in Review
Posted on September 8th, 2011 by John. Filed under calibre.
This week I finally sat down and spent some time with Markdown input and output. Both saw major changes. Markdown input was bumped to upstream version 2.0. Output was completely rewritten from scratch. Markdown output is now completely custom code (not using a third party output module like before). I based the new Markdown code off of the Textile output classes I helped Perkin to create.
As with all new code and major changes there are probably bugs. I tested Markdown output with a variety of test material and kept working at it until everything converted acceptably. I also used a variety of the Markdown tests provided by John Gruber to ensure my output was correct. When converting the HTML output tests back to Markdown the output is similar enough to the original that I feel it is acceptable.
The last big change I made this week was adding a new OEB transformation to unsmarten punctuation. As the name implies it changes curly quotes, apostrophes and a few other characters to their plain text, straight equivalents. It basically does the opposite of smarten punctuation. I find this especially useful when converting to formatted (Textile or Markdown) plain text files (TXT).
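To give a rough idea of what unsmartening means, here is a small sketch. It is not calibre's code (calibre is written in Python) and the function name unsmarten is made up. The exact list of characters and their replacements is an assumption chosen for illustration, and the input is assumed to be UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Replace "smart" punctuation in a UTF-8 string with plain ASCII equivalents.
std::string unsmarten(const std::string& in) {
    // Assumed mapping for illustration only; the real transformation may differ.
    static const std::vector<std::pair<std::string, std::string>> table = {
        {"\xE2\x80\x98", "'"},    // U+2018 left single quote
        {"\xE2\x80\x99", "'"},    // U+2019 right single quote / apostrophe
        {"\xE2\x80\x9C", "\""},   // U+201C left double quote
        {"\xE2\x80\x9D", "\""},   // U+201D right double quote
        {"\xE2\x80\x93", "-"},    // U+2013 en dash
        {"\xE2\x80\x94", "--"},   // U+2014 em dash
        {"\xE2\x80\xA6", "..."},  // U+2026 horizontal ellipsis
    };
    std::string out;
    out.reserve(in.size());
    std::size_t i = 0;
    while (i < in.size()) {
        bool replaced = false;
        for (const auto& entry : table) {
            // compare() clips at the end of the string, so a short tail never matches.
            if (in.compare(i, entry.first.size(), entry.first) == 0) {
                out += entry.second;
                i += entry.first.size();
                replaced = true;
                break;
            }
        }
        if (!replaced) {
            out += in[i++];  // copy every other byte through unchanged
        }
    }
    assert(i == in.size());
    return out;
}
```

Text that contains none of the listed characters passes through unchanged, which is what you want when the output is going to a plain text format.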
