Selected text refers to the text that has been selected by a user, probably using the keyboard or the mouse. Selected text is represented as nothing more than a text range. After you find a text selection, it's possible to get the text [getString()] and set the text [setString()]. Although strings are limited to 64KB in size, selections are not. There are some instances, therefore, when you can't use the getString() and setString() methods. It's usually better to use a cursor to traverse the selected text and then use the cursor's text object to insert text content. Most problems using selected text look the same at an abstract level.
If nothing is selected then
  do work on entire document
else
  for each selected area
    do work on selected area
The difficult part that changes each time is writing a macro that iterates over a selection or between two text ranges.
Text documents provide the method getCurrentSelection(). (It is defined by the com.sun.star.frame.XModel interface; the XTextSectionsSupplier interface, see Table 1, defines only getTextSections().) If there is no current controller (which means that you're an advanced user running OpenOffice.org as a server with no user interface and you won't be looking for selected text anyway), getCurrentSelection() returns a null rather than any selected text.
If the selection count is zero, nothing is selected. I have never seen a selection count of zero, but I check for it anyway. If no text is selected, there is one selection of zero length; the start and end positions are the same. I have seen examples, which I consider unsafe, where a zero-length selection is determined as follows:
If Len(oSel.getString()) = 0 Then nothing is selected
It is possible that selected text will contain more than 64KB characters, and a string cannot contain more than 64KB characters. Therefore, don't check the length of the selected string to see if it is zero; this is not safe. The better solution is to create a text cursor from the selected range and then check to see if the start and end points are the same.
oCursor = oDoc.Text.CreateTextCursorByRange(oSel)
If oCursor.IsCollapsed() Then nothing is selected
The macro function in Listing 21 performs the entire check, returning True if something is selected, and False if nothing is selected.
Function IsAnythingSelected(oDoc As Object) As Boolean
  Dim oSelections 'Contains all of the selections
  Dim oSel        'Contains one specific selection
  Dim oCursor     'Text cursor to check for a collapsed range
  REM Assume nothing is selected
  IsAnythingSelected = False
  If IsNull(oDoc) Then Exit Function
  'The current selection in the current controller.
  'If there is no current controller, it returns NULL.
  oSelections = oDoc.getCurrentSelection()
  If IsNull(oSelections) Then Exit Function
  If oSelections.getCount() = 0 Then Exit Function
  If oSelections.getCount() > 1 Then
    REM There is more than one selection so return True
    IsAnythingSelected = True
  Else
    REM There is only one selection so obtain the first selection
    oSel = oSelections.getByIndex(0)
    REM Create a text cursor that covers the range and then see if it is
    REM collapsed.
    oCursor = oDoc.Text.CreateTextCursorByRange(oSel)
    If Not oCursor.IsCollapsed() Then IsAnythingSelected = True
    REM You can also compare these to see if the selection starts and ends at
    REM the same location.
    REM If oDoc.Text.compareRegionStarts(oSel.getStart(), _
    REM                                  oSel.getEnd()) <> 0 Then
    REM   IsAnythingSelected = True
    REM End If
  End If
End Function
Obtaining a selection is complicated because it's possible to have multiple non-contiguous selections. Some selections are empty and some are not. If you write code to handle text selection, it should handle all of these cases because they occur frequently. The example in Listing 22 iterates through all selected sections and displays them in a message box.
Sub PrintMultipleTextSelection
  Dim oSelections             'Contains all of the selections
  Dim oSel                    'Contains one specific selection
  Dim lWhichSelection As Long 'Which selection to print
  If NOT IsAnythingSelected(ThisComponent) Then
    Print "Nothing is selected"
  Else
    oSelections = ThisComponent.getCurrentSelection()
    For lWhichSelection = 0 To oSelections.getCount() - 1
      oSel = oSelections.getByIndex(lWhichSelection)
      MsgBox oSel.getString(), 0, "Selection " & lWhichSelection
    Next
  End If
End Sub
Selections are text ranges with a start and an end. Although selections have both a start and an end, which side of the text is which is determined by the selection method. For example, position the cursor in the middle of a line, and then select text by moving the cursor either right or left. In both cases, the start position is the same. In one of these cases, the start position is after the end position. The text object provides methods to compare starting and ending positions of text ranges (see Table 13). I use the two methods in Listing 23 to find the leftmost and rightmost cursor position of selected text.
'oSel is a text selection or cursor range
'oText is the text object
Function GetLeftMostCursor(oSel, oText)
  Dim oRange
  Dim oCursor
  If oText.compareRegionStarts(oSel.getEnd(), oSel) >= 0 Then
    oRange = oSel.getEnd()
  Else
    oRange = oSel.getStart()
  End If
  oCursor = oText.CreateTextCursorByRange(oRange)
  REM Orient the cursor to travel right, into the selection
  oCursor.goRight(0, False)
  GetLeftMostCursor = oCursor
End Function

'oSel is a text selection or cursor range
'oText is the text object
Function GetRightMostCursor(oSel, oText)
  Dim oRange
  Dim oCursor
  If oText.compareRegionStarts(oSel.getEnd(), oSel) >= 0 Then
    oRange = oSel.getStart()
  Else
    oRange = oSel.getEnd()
  End If
  oCursor = oText.CreateTextCursorByRange(oRange)
  REM Orient the cursor to travel left, into the selection
  oCursor.goLeft(0, False)
  GetRightMostCursor = oCursor
End Function
While using text cursors to move through a document, I noticed that cursors remember the direction in which they are traveling. The cursors returned by the macros in Listing 23 are oriented to travel into the text selection by moving the cursor left or right zero characters. This is also an issue while moving a cursor to the right and then turning it around to move left. I always start by moving the cursor zero characters in the desired direction before actually moving the cursor. Then my macro can use these cursors to traverse the selected text from the start (moving right) or the end (moving left).
It took me a long time to understand how to iterate over selected text using cursors. I have, therefore, written many macros that do things in what I consider the wrong way. I now use a framework that returns a two-dimensional array of start and end cursors over which to iterate. Using a framework allows me to use a very minimal code base to iterate over selected text or the entire document. If no text is selected, the framework asks if the macro should use the entire document. If the answer is yes, a cursor is created at the start and the end of the document. If text is selected, each selection is retrieved, and a cursor is obtained at the start and end of each selection. See Listing 24.
'sPrompt : How to ask if should iterate over the entire text
'oCursors() : Has the return cursors
'Returns True if should iterate and False if should not
Function CreateSelectedTextIterator(oDoc, sPrompt$, oCursors()) As Boolean
  Dim oSelections             'Contains all of the selections
  Dim oSel                    'Contains one specific selection
  Dim oText                   'Document text object
  Dim lSelCount As Long       'Number of selections
  Dim lWhichSelection As Long 'Current selection
  Dim oLCursor, oRCursor      'Temporary cursors
  CreateSelectedTextIterator = True
  oText = oDoc.Text
  If Not IsAnythingSelected(oDoc) Then
    Dim i%
    i% = MsgBox("No text selected!" + CHR$(13) + sPrompt, _
                1 OR 32 OR 256, "Warning")
    If i% = 1 Then
      oLCursor = oText.createTextCursorByRange(oText.getStart())
      oRCursor = oText.createTextCursorByRange(oText.getEnd())
      oCursors = DimArray(0, 1) 'Two-Dimensional array with one row
      oCursors(0, 0) = oLCursor
      oCursors(0, 1) = oRCursor
    Else
      oCursors = DimArray() 'Return an empty array
      CreateSelectedTextIterator = False
    End If
  Else
    oSelections = oDoc.getCurrentSelection()
    lSelCount = oSelections.getCount()
    oCursors = DimArray(lSelCount - 1, 1)
    For lWhichSelection = 0 To lSelCount - 1
      oSel = oSelections.getByIndex(lWhichSelection)
      oLCursor = GetLeftMostCursor(oSel, oText)
      oRCursor = GetRightMostCursor(oSel, oText)
      oCursors(lWhichSelection, 0) = oLCursor
      oCursors(lWhichSelection, 1) = oRCursor
    Next
  End If
End Function
Note | The argument oCursors() is an array that is set in the macro in Listing 24. |
The macro in Listing 25 uses the selected text framework to print the Unicode values of the selected text.
Sub PrintUnicodeExamples
  Dim oCursors(), i%
  If Not CreateSelectedTextIterator(ThisComponent, _
    "Print Unicode for the entire document?", oCursors()) Then Exit Sub
  For i% = LBound(oCursors()) To UBound(oCursors())
    PrintUnicode_worker(oCursors(i%, 0), oCursors(i%, 1), ThisComponent.Text)
  Next i%
End Sub

Sub PrintUnicode_worker(oLCursor, oRCursor, oText)
  Dim s As String  'contains the primary message string
  Dim ss As String 'used as a temporary string
  If IsNull(oLCursor) Or IsNull(oRCursor) Or IsNull(oText) Then Exit Sub
  REM Ignore any collapsed ranges
  If oText.compareRegionEnds(oLCursor, oRCursor) <= 0 Then Exit Sub
  REM Start the cursor in the correct direction with no text selected
  oLCursor.goRight(0, False)
  Do While oLCursor.goRight(1, True) _
    AND oText.compareRegionEnds(oLCursor, oRCursor) >= 0
    ss = oLCursor.getString()
    REM The string may be empty
    If Len(ss) > 0 Then
      s = s & oLCursor.getString() & "=" & ASC(oLCursor.getString()) & " "
    End If
    REM Deselect the character so the next pass selects only one character
    oLCursor.goRight(0, False)
  Loop
  MsgBox s, 0, "Unicode Values"
End Sub
A common request is for a macro that removes extra blank spaces. To remove all empty paragraphs, it's better to use the Remove Blank Paragraphs option from the AutoCorrect dialog (Tools > AutoCorrect/AutoFormat > Options). To remove only selected paragraphs or runs of blank space, a macro is required.
This section presents a set of macros that replaces all runs of white-space characters with a single white-space character. You can easily modify this macro to delete different types of white space. The different types of space are ordered by importance, so if you have a regular space followed by a new paragraph, the new paragraph stays and the single space is removed. The end effect is that leading and trailing white space is removed from each line.
The term "white space" typically refers to any character that is displayed as a blank space. This includes tabs (character value 9), regular spaces (character value 32), nonbreaking spaces (character value 160, which lies outside the 7-bit ASCII range), new paragraphs (character value 13), and new lines (character value 10). By encapsulating the definition of white space into a function (see Listing 26), you can trivially change the definition of white space to ignore certain characters.
Function IsWhiteSpace(iChar As Integer) As Boolean
  Select Case iChar
    Case 9, 10, 13, 32, 160
      IsWhiteSpace = True
    Case Else
      IsWhiteSpace = False
  End Select
End Function
While removing runs of white space, each character is compared to the character before it. If both characters are white space, the less important character is deleted. For example, if there is both a space and a new paragraph, the space is deleted. The RankChar() function (see Listing 27) accepts two characters: the previous character and the current character. The returned integer indicates which, if any, character should be deleted.
'-1 means delete the previous character
' 0 means ignore this character
' 1 means delete this character
' If an input character is 0, this is the start of a line.
' Rank from highest to lowest is: 0, 13, 10, 9, 160, 32
Function RankChar(iPrevChar, iCurChar) As Integer
  If Not IsWhiteSpace(iCurChar) Then     'Current not white space, ignore it
    RankChar = 0
  ElseIf iPrevChar = 0 Then              'Line start, current is white space
    RankChar = 1                         ' delete the current character.
  ElseIf Not IsWhiteSpace(iPrevChar) Then 'Current is space but not previous
    RankChar = 0                         ' ignore the current character.
  REM At this point, both characters are white space
  ElseIf iPrevChar = 13 Then             'Previous is highest ranked space
    RankChar = 1                         ' delete the current character.
  ElseIf iCurChar = 13 Then              'Current is highest ranked space
    RankChar = -1                        ' delete the previous character.
  REM Neither character is a new paragraph, the highest ranked
  ElseIf iPrevChar = 10 Then             'Previous is new line
    RankChar = 1                         ' delete the current character.
  ElseIf iCurChar = 10 Then              'Current is new line
    RankChar = -1                        ' delete the previous character.
  REM At this point, the highest ranking possible is a tab
  ElseIf iPrevChar = 9 Then              'Previous char is tab
    RankChar = 1                         ' delete the current character.
  ElseIf iCurChar = 9 Then               'Current char is tab
    RankChar = -1                        ' delete the previous character.
  ElseIf iPrevChar = 160 Then            'Previous char is a hard space
    RankChar = 1                         ' delete the current character.
  ElseIf iCurChar = 160 Then             'Current char is a hard space
    RankChar = -1                        ' delete the previous character.
  ElseIf iPrevChar = 32 Then             'Previous char is a regular space
    RankChar = 1                         ' delete the current character.
  REM Probably should never get here... both characters are white space
  REM and the previous is not any known white space character.
  ElseIf iCurChar = 32 Then              'Current char is a regular space
    RankChar = -1                        ' delete the previous character.
  Else                                   'Should probably not get here
    RankChar = 0                         'so simply ignore it!
  End If
End Function
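To make the ranking rules concrete, the following sketch calls RankChar() with a few character pairs and shows the results in a message box. The routine name RankCharExamples is hypothetical and is not part of the original listings; it assumes that IsWhiteSpace() and RankChar() are available in the same module. The values in the comments follow from tracing the ElseIf chain in RankChar().

```basic
REM Hypothetical helper, for illustration only.
REM Assumes IsWhiteSpace() and RankChar() are defined in the same module.
Sub RankCharExamples
  Dim s As String
  REM Two regular spaces: the current (second) space is deleted.
  s = "space, space = " & RankChar(32, 32) & Chr$(10)          ' 1
  REM Line start followed by a tab: leading white space is deleted.
  s = s & "line start, tab = " & RankChar(0, 9) & Chr$(10)     ' 1
  REM Space followed by a new paragraph: the previous space is deleted.
  s = s & "space, paragraph = " & RankChar(32, 13) & Chr$(10)  ' -1
  REM New paragraph followed by a tab: the tab is deleted.
  s = s & "paragraph, tab = " & RankChar(13, 9) & Chr$(10)     ' 1
  REM Letter A followed by a space: a single space is left alone.
  s = s & "letter, space = " & RankChar(65, 32) & Chr$(10)     ' 0
  REM Space followed by the letter A: nothing to delete.
  s = s & "space, letter = " & RankChar(32, 65)                ' 0
  MsgBox s, 0, "RankChar examples"
End Sub
```

Because the previous character is tested before the current character at each rank, two white-space characters of equal rank always resolve to deleting the current one.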
The standard framework is used to remove the empty spaces. The primary routine is simple enough that it barely warrants mentioning (see Listing 28).
Sub RemoveEmptySpace
  Dim oCursors(), i%
  If Not CreateSelectedTextIterator(ThisComponent, _
    "ALL empty space will be removed from the ENTIRE document?", _
    oCursors()) Then Exit Sub
  For i% = LBound(oCursors()) To UBound(oCursors())
    RemoveEmptySpaceWorker(oCursors(i%, 0), oCursors(i%, 1), ThisComponent.Text)
  Next i%
End Sub
The macro in Listing 29 represents the interesting part of this problem; it decides what is deleted and what is left untouched. Some interesting points should be noted:
Because a text cursor is used, the formatting is left unchanged.
A text range (cursor) may contain text content that returns a zero-length string. This includes, for example, buttons and graphic images contained in the document. Handling exceptional cases adds complexity to the macro. Many tasks are very simple if you ignore the exceptional cases, such as inserted graphics. If you know that your macro will run with simple controlled data, you may choose to sacrifice robustness to reduce complexity. Listing 29 handles the exceptional cases.
If the selected text starts or ends with white space, it will be removed even if it does not start or end the document.
Sub RemoveEmptySpaceWorker(oLCursor, oRCursor, oText)
  Dim s As String          'Temporary text string
  Dim i As Integer         'Temporary integer used for comparing text ranges
  Dim iLastChar As Integer 'Unicode of last character
  Dim iThisChar As Integer 'Unicode of the current character
  Dim iRank As Integer     'Integer ranking that decides what to delete
  REM If something is null, then do nothing
  If IsNull(oLCursor) Or IsNull(oRCursor) Or IsNull(oText) Then Exit Sub
  REM Ignore any collapsed ranges
  If oText.compareRegionEnds(oLCursor, oRCursor) <= 0 Then Exit Sub
  REM Default the first and last character to indicate start of new line
  iLastChar = 0
  iThisChar = 0
  REM Start the leftmost cursor moving toward the end of the document
  REM and make certain that no text is currently selected.
  oLCursor.goRight(0, False)
  REM At the end of the document, the cursor can no longer move right
  Do While oLCursor.goRight(1, True)
    REM It is possible that the string is zero length.
    REM This can happen when stepping over certain objects anchored into
    REM the text that contain no text. Extra care must be taken because
    REM this routine can delete these items because the cursor steps over
    REM them but they have no text length. I arbitrarily call this a regular
    REM ASCII character without obtaining the string.
    s = oLCursor.getString()
    If Len(s) = 0 Then
      oLCursor.goRight(0, False)
      iThisChar = 65
    Else
      iThisChar = Asc(oLCursor.getString())
    End If
    REM If at the last character Then always remove white space
    i = oText.compareRegionEnds(oLCursor, oRCursor)
    If i = 0 Then
      If IsWhiteSpace(iThisChar) Then oLCursor.setString("")
      Exit Do
    End If
    REM If went past the end then get out
    If i < 0 Then Exit Do
    iRank = RankChar(iLastChar, iThisChar)
    If iRank = 1 Then
      REM Ready to delete the current character.
      REM The iLastChar is not changed.
      REM Deleting the current character by setting the text to the
      REM empty string causes no text to be selected.
      'Print "Deleting Current with " + iLastChar + " and " + iThisChar
      oLCursor.setString("")
    ElseIf iRank = -1 Then
      REM Ready to delete the previous character. One character is already
      REM selected. It was selected by moving right so moving left two
      REM deselects the currently selected character and selects the
      REM character to the left.
      oLCursor.goLeft(2, True)
      'Print "Deleting to the left with " + iLastChar + " and " + iThisChar
      oLCursor.setString("")
      REM Now the cursor is moved over the current character again but
      REM this time it is not selected.
      oLCursor.goRight(1, False)
      REM Set the previous character to the current character.
      iLastChar = iThisChar
    Else
      REM Instructed to ignore the current character so deselect any text
      REM and then set the last character to the current character.
      oLCursor.goRight(0, False)
      iLastChar = iThisChar
    End If
  Loop
End Sub
Anyone who has studied algorithms will tell you that a better algorithm is almost always better than a faster computer. An early problem that I solved was counting words in selected text. I created three solutions with varying degrees of success.
My first solution converted the selected text to OOo Basic strings and then manipulated the strings. This solution was fast, counting 8000 words in 2.7 seconds. This solution failed when text strings exceeded 64KB in size, rendering it useless for large documents.
My second solution used a cursor as it walked through the text one character at a time. This solution, although able to handle any length of text, required 47 seconds to count the same 8000 words. In other words, the users found the solution unusably slow.
My final solution used a word cursor, which counted the words in 1.7 seconds.
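The word-cursor approach can be sketched as follows. This is a minimal illustration under stated assumptions, not the macro that produced the timing above: the function name CountWordsSketch is hypothetical, it walks the entire main text rather than only the selected text, and it relies on the text cursor also supporting the word cursor methods gotoNextWord() and isStartOfWord(). What counts as a word boundary is decided by the office's own rules, so the result may differ from other word counts.

```basic
REM Hypothetical sketch: count words in the main document text by
REM jumping from word to word rather than stepping one character at a time.
Function CountWordsSketch(oDoc) As Long
  Dim oCursor        'Text cursor, used here as a word cursor
  Dim lCount As Long 'Number of word starts seen so far
  lCount = 0
  oCursor = oDoc.Text.createTextCursor()
  oCursor.gotoStart(False)
  Do
    REM Count only positions that the cursor reports as the start of a word
    If oCursor.isStartOfWord() Then lCount = lCount + 1
  REM gotoNextWord returns False when the cursor cannot move any farther
  Loop While oCursor.gotoNextWord(False)
  CountWordsSketch = lCount
End Function
```

A call such as MsgBox CountWordsSketch(ThisComponent) displays the count. The cursor is created from the document's main text object, so text inside tables, frames, and similar containers is not visited by this sketch.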
Tip | To count words correctly, visit Andrew Brown's useful macro Web site: http://www.darwinwars.com/lunatic/bugs/oo_macros.html. |