From 3fd81c73cfd8bad36b2a1cf7955006e35c1d9db5 Mon Sep 17 00:00:00 2001 From: Camil Staps Date: Fri, 23 Oct 2015 16:44:24 +0200 Subject: Assignment 3: code, plots --- Assignment 3/packages/xlrd/doc/xlrd.html | 2064 ++++++++++++++++++++++++++++++ 1 file changed, 2064 insertions(+) create mode 100644 Assignment 3/packages/xlrd/doc/xlrd.html (limited to 'Assignment 3/packages/xlrd/doc/xlrd.html') diff --git a/Assignment 3/packages/xlrd/doc/xlrd.html b/Assignment 3/packages/xlrd/doc/xlrd.html new file mode 100644 index 0000000..25113cc --- /dev/null +++ b/Assignment 3/packages/xlrd/doc/xlrd.html @@ -0,0 +1,2064 @@ + + +
+ +A Python module for extracting data from MS Excel (TM) spreadsheet files.
+
+Version 0.7.4 -- April 2012
+
+Development of this module would not have been possible without the document +"OpenOffice.org's Documentation of the Microsoft Excel File Format" +("OOo docs" for short). +The latest version is available from OpenOffice.org in + PDF format +and + ODT format. +Small portions of the OOo docs are reproduced in this +document. A study of the OOo docs is recommended for those who wish a +deeper understanding of the Excel file layout than the xlrd docs can provide. +
+ +Backporting to Python 2.1 was partially funded by + + Journyx - provider of timesheet and project accounting solutions. + +
+ +Provision of formatting information in version 0.6.1 was funded by + + Simplistix Ltd. + +
+ +This module presents all text strings as Python unicode objects. +From Excel 97 onwards, text in Excel spreadsheets has been stored as Unicode. +Older files (Excel 95 and earlier) don't keep strings in Unicode; +a CODEPAGE record provides a codepage number (for example, 1252) which is +used by xlrd to derive the encoding (for same example: "cp1252") which is +used to translate to Unicode.
+ +If the CODEPAGE record is missing (possible if the file was created +by third-party software), xlrd will assume that the encoding is ascii, and keep going. +If the actual encoding is not ascii, a UnicodeDecodeError exception will be raised and +you will need to determine the encoding yourself, and tell xlrd: +
+ book = xlrd.open_workbook(..., encoding_override="cp1252") ++
If the CODEPAGE record exists but is wrong (for example, the codepage +number is 1251, but the strings are actually encoded in koi8_r), +it can be overridden using the same mechanism. +The supplied runxlrd.py has a corresponding command-line argument, which +may be used for experimentation: +
+ runxlrd.py -e koi8_r 3rows myfile.xls ++
The first place to look for an encoding ("codec name") is + +the Python documentation. +
+ + +In reality, there are no such things. What you have are floating point +numbers and pious hope. +There are several problems with Excel dates:
+ +(1) Dates are not stored as a separate data type; they are stored as +floating point numbers and you have to rely on +(a) the "number format" applied to them in Excel and/or +(b) knowing which cells are supposed to have dates in them. +This module helps with (a) by inspecting the +format that has been applied to each number cell; +if it appears to be a date format, the cell +is classified as a date rather than a number. Feedback on this feature, +especially from non-English-speaking locales, would be appreciated.
+ +(2) Excel for Windows stores dates by default as the number of +days (or fraction thereof) since 1899-12-31T00:00:00. Excel for +Macintosh uses a default start date of 1904-01-01T00:00:00. The date +system can be changed in Excel on a per-workbook basis (for example: +Tools -> Options -> Calculation, tick the "1904 date system" box). +This is of course a bad idea if there are already dates in the +workbook. There is no good reason to change it even if there are no +dates in the workbook. Which date system is in use is recorded in the +workbook. A workbook transported from Windows to Macintosh (or vice +versa) will work correctly with the host Excel. When using this +module's xldate_as_tuple function to convert numbers from a workbook, +you must use the datemode attribute of the Book object. If you guess, +or make a judgement depending on where you believe the workbook was +created, you run the risk of being 1462 days out of kilter.
+ +Reference: +http://support.microsoft.com/default.aspx?scid=KB;EN-US;q180162
+ + +(3) The Excel implementation of the Windows-default 1900-based date system works on the +incorrect premise that 1900 was a leap year. It interprets the number 60 as meaning 1900-02-29, +which is not a valid date. Consequently any number less than 61 is ambiguous. Example: is 59 the +result of 1900-02-28 entered directly, or is it 1900-03-01 minus 2 days? The OpenOffice.org Calc +program "corrects" the Microsoft problem; entering 1900-02-27 causes the number 59 to be stored. +Save as an XLS file, then open the file with Excel -- you'll see 1900-02-28 displayed.
+ +Reference: http://support.microsoft.com/default.aspx?scid=kb;en-us;214326
+ +(4) The Macintosh-default 1904-based date system counts 1904-01-02 as day 1 and 1904-01-01 as day zero. +Thus any number such that (0.0 <= number < 1.0) is ambiguous. Is 0.625 a time of day (15:00:00), +independent of the calendar, +or should it be interpreted as an instant on a particular day (1904-01-01T15:00:00)? +The xldate_* functions in this module +take the view that such a number is a calendar-independent time of day (like Python's datetime.time type) for both +date systems. This is consistent with more recent Microsoft documentation +(for example, the help file for Excel 2002 which says that the first day +in the 1904 date system is 1904-01-02). + +
(5) Usage of the Excel DATE() function may leave strange dates in a spreadsheet. Quoting the help file, +in respect of the 1900 date system: "If year is between 0 (zero) and 1899 (inclusive), +Excel adds that value to 1900 to calculate the year. For example, DATE(108,1,2) returns January 2, 2008 (1900+108)." +This gimmick, semi-defensible only for arguments up to 99 and only in the pre-Y2K-awareness era, +means that DATE(1899, 12, 31) is interpreted as 3799-12-31.
+ +For further information, please refer to the documentation for the xldate_* functions.
+ ++A name is used to refer to a cell, a group of cells, a constant +value, a formula, or a macro. Usually the scope of a name is global +across the whole workbook. However it can be local to a worksheet. +For example, if the sales figures are in different cells in +different sheets, the user may define the name "Sales" in each +sheet. There are built-in names, like "Print_Area" and +"Print_Titles"; these two are naturally local to a sheet. +
+To inspect the names with a user interface like MS Excel, OOo Calc, +or Gnumeric, click on Insert/Names/Define. This will show the global +names, plus those local to the currently selected sheet. +
+A Book object provides two dictionaries (name_map and +name_and_scope_map) and a list (name_obj_list) which allow various +ways of accessing the Name objects. There is one Name object for +each NAME record found in the workbook. Name objects have many +attributes, several of which are relevant only when obj.macro is 1. +
+In the examples directory you will find namesdemo.xls which +showcases the many different ways that names can be used, and +xlrdnamesAPIdemo.py which offers 3 different queries for inspecting +the names in your files, and shows how to extract whatever a name is +referring to. There is currently one "convenience method", +Name.cell(), which extracts the value in the case where the name +refers to a single cell. More convenience methods are planned. The +source code for Name.cell (in __init__.py) is an extra source of +information on how the Name attributes hang together. +
+ +Name information is not extracted from files older than +Excel 5.0 (Book.biff_version < 50)
+ +This collection of features, new in xlrd version 0.6.1, is intended +to provide the information needed to (1) display/render spreadsheet contents +(say) on a screen or in a PDF file, and (2) copy spreadsheet data to another +file without losing the ability to display/render it.
+ +A colour is represented in Excel as a (red, green, blue) ("RGB") tuple +with each component in range(256). However it is not possible to access an +unlimited number of colours; each spreadsheet is limited to a palette of 64 different +colours (24 in Excel 3.0 and 4.0, 8 in Excel 2.0). Colours are referenced by an index +("colour index") into this palette. + +Colour indexes 0 to 7 represent 8 fixed built-in colours: black, white, red, green, blue, +yellow, magenta, and cyan.
+
+The remaining colours in the palette (8 to 63 in Excel 5.0 and later)
+can be changed by the user. In the Excel 2003 UI, Tools/Options/Color presents a palette
+of 7 rows of 8 colours. The last two rows are reserved for use in charts.
+The correspondence between this grid and the assigned
+colour indexes is NOT left-to-right top-to-bottom.
+Indexes 8 to 15 correspond to changeable
+parallels of the 8 fixed colours -- for example, index 7 is forever cyan;
+index 15 starts off being cyan but can be changed by the user.
+
+The default colour for each index depends on the file version; tables of the defaults
+are available in the source code. If the user changes one or more colours,
+a PALETTE record appears in the XLS file -- it gives the RGB values for *all* changeable
+indexes.
+Note that colours can be used in "number formats": "[CYAN]...." and "[COLOR8]...." refer
+to colour index 7; "[COLOR16]...." will produce cyan
+unless the user changes colour index 15 to something else.
+
+
In addition, there are several "magic" colour indexes used by Excel:
+0x18 (BIFF3-BIFF4), 0x40 (BIFF5-BIFF8): System window text colour for border lines
+(used in XF, CF, and WINDOW2 records)
+0x19 (BIFF3-BIFF4), 0x41 (BIFF5-BIFF8): System window background colour for pattern background
+(used in XF and CF records )
+0x43: System face colour (dialogue background colour)
+0x4D: System window text colour for chart border lines
+0x4E: System window background colour for chart areas
+0x4F: Automatic colour for chart border lines (seems to be always Black)
+0x50: System ToolTip background colour (used in note objects)
+0x51: System ToolTip text colour (used in note objects)
+0x7FFF: System window text colour for fonts (used in FONT and CF records)
+Note 0x7FFF appears to be the *default* colour index. It appears quite often in FONT
+records.
+
+
This feature, new in version 0.7.1, is governed by the on_demand argument +to the open_workbook() function and allows saving memory and time by loading +only those sheets that the caller is interested in, and releasing sheets +when no longer required.
+ +on_demand=False (default): No change. open_workbook() loads global data +and all sheets, releases resources no longer required (principally the +str or mmap object containing the Workbook stream), and returns.
+ +on_demand=True and BIFF version < 5.0: A warning message is emitted, +on_demand is recorded as False, and the old process is followed.
+ +on_demand=True and BIFF version >= 5.0: open_workbook() loads global +data and returns without releasing resources. At this stage, the only +information available about sheets is Book.nsheets and Book.sheet_names().
+ +Book.sheet_by_name() and Book.sheet_by_index() will load the requested +sheet if it is not already loaded.
+ +Book.sheets() will load all/any unloaded sheets.
+ +The caller may save memory by calling +Book.unload_sheet(sheet_name_or_index) when finished with the sheet. +This applies irrespective of the state of on_demand.
+ +The caller may re-load an unloaded sheet by calling Book.sheet_by_xxxx() + -- except if those required resources have been released (which will +have happened automatically when on_demand is false). This is the only +case where an exception will be raised.
+ +The caller may query the state of a sheet: +Book.sheet_loaded(sheet_name_or_index) -> a bool
+ +Book.release_resources() may used to save memory and close +any memory-mapped file before proceding to examine already-loaded +sheets. Once resources are released, no further sheets can be loaded.
+ +When using on-demand, it is advisable to ensure that +Book.release_resources() is always called even if an exception +is raised in your own code; otherwise if the input file has been +memory-mapped, the mmap.mmap object will not be closed and you will +not be able to access the physical file until your Python process +terminates. This can be done by calling Book.release_resources() +explicitly in the finally suite of a try/finally block. +New in xlrd 0.7.2: the Book object is a "context manager", so if +using Python 2.5 or later, you can wrap your code in a "with" +statement.
+Parent of almost all other classes in the package.
+For more information about this class, see The BaseObject Class.
+Contents of a "workbook".
+For more information about this class, see The Book Class.
+Contains the data for one cell.
+For more information about this class, see The Cell Class.
+Utility function: (5, 7) => 'H6'
+Utility function: (5, 7) => '$H$6'
+Width and default formatting information that applies to one or +more columns in a sheet.
+For more information about this class, see The Colinfo Class.
+Utility function: 7 => 'H', 27 => 'AB'
+For debugging and analysis: summarise the file's BIFF records. +I.e. produce a sorted file of (record_name, count).
+For debugging: dump the file's BIFF records in char & hex. +
There is one and only one instance of an empty cell -- it's a singleton. This is it. +You may use a test like "acell is empty_cell".
+This mixin class exists solely so that Format, Font, and XF....
+For more information about this class, see The EqNeAttrs Class.
+This dictionary can be used to produce a text version of the internal codes +that Excel uses for error cells. Here are its contents: +
+0x00: '#NULL!', # Intersection of two cell ranges is empty +0x07: '#DIV/0!', # Division by zero +0x0F: '#VALUE!', # Wrong type of operand +0x17: '#REF!', # Illegal or deleted cell reference +0x1D: '#NAME?', # Wrong function or range name +0x24: '#NUM!', # Value range overflow +0x2A: '#N/A', # Argument or function not available ++
An Excel "font" contains the details of not only what is normally +considered a font, but also several other display attributes.
+For more information about this class, see The Font Class.
+"Number format" information from a FORMAT record.
+For more information about this class, see The Format Class.
+Contains the attributes of a hyperlink.
+For more information about this class, see The Hyperlink Class.
+Information relating to a named reference, formula, macro, etc.
+For more information about this class, see The Name Class.
+Represents a user "comment" or "note".
+For more information about this class, see The Note Class.
+Open a spreadsheet file for data extraction.
+Used in evaluating formulas.
+For more information about this class, see The Operand Class.
+Utility function:
+
Ref3D((1, 4, 5, 20, 7, 10)) => 'Sheet2:Sheet3!$H$6:$J$20'
+
Utility function:
+
Ref3D(coords=(0, 1, -32, -22, -13, 13), relflags=(0, 0, 1, 1, 1, 1))
+R1C1 mode => 'Sheet1!R[-32]C[-13]:R[-23]C[12]'
+A1 mode => depends on base cell (browx, bcolx)
+
Represents an absolute or relative 3-dimensional reference to a box +of one or more cells.
+For more information about this class, see The Ref3D Class.
+Height and default formatting information that applies to a row in a sheet.
+For more information about this class, see The Rowinfo Class.
+Contains the data for one worksheet.
+For more information about this class, see The Sheet Class.
+eXtended Formatting information for cells, rows, columns and styles.
+For more information about this class, see The XF Class.
+A collection of the alignment and similar attributes of an XF record.
+For more information about this class, see The XFAlignment Class.
+A collection of the background-related attributes of an XF record.
+For more information about this class, see The XFBackground Class.
+A collection of the border-related attributes of an XF record.
+For more information about this class, see The XFBorder Class.
+A collection of the protection-related attributes of an XF record.
+For more information about this class, see The XFProtection Class.
+Convert an Excel number (presumed to represent a date, a datetime or a time) into +a tuple suitable for feeding to datetime or mx.DateTime constructors.
+Convert a date tuple (year, month, day) to an Excel date.
+Convert a datetime tuple (year, month, day, hour, minute, second) to an Excel date value. +For more details, refer to other xldate_from_*_tuple functions.
+Convert a time tuple (hour, minute, second) to an Excel "date" value (fraction of a day).
+Parent of almost all other classes in the package. Defines a common "dump" method +for debugging.
+Contents of a "workbook". +
WARNING: You don't call this class yourself. You use the Book object that +was returned when you called xlrd.open_workbook("myfile.xls").
+Version of BIFF (Binary Interchange File Format) used to create the file. +Latest is 8.0 (represented here as 80), introduced with Excel 97. +Earliest supported by this module: 2.0 (represented as 20).
+An integer denoting the character set used for strings in this file. +For BIFF 8 and later, this will be 1200, meaning Unicode; more precisely, UTF_16_LE. +For earlier versions, this is used to derive the appropriate Python encoding +to be used to convert to Unicode. +Examples: 1252 -> 'cp1252', 10000 -> 'mac_roman'
+This provides definitions for colour indexes. Please refer to the
+above section "The Palette; Colour Indexes" for an explanation
+of how colours are represented in Excel.
+Colour indexes into the palette map into (red, green, blue) tuples.
+"Magic" indexes e.g. 0x7FFF map to None.
+colour_map is what you need if you want to render cells on screen or in a PDF
+file. If you are writing an output XLS file, use palette_record.
+
-- New in version 0.6.1. Extracted only if open_workbook(..., formatting_info=True)
+
A tuple containing the (telephone system) country code for:
+ [0]: the user-interface setting when the file was created.
+ [1]: the regional settings.
+Example: (1, 61) meaning (USA, Australia).
+This information may give a clue to the correct encoding for an unknown codepage.
+For a long list of observed values, refer to the OpenOffice.org documentation for
+the COUNTRY record.
+
Which date system was in force when this file was last saved.
+ 0 => 1900 system (the Excel for Windows default).
+ 1 => 1904 system (the Excel for Macintosh default).
+
The encoding that was derived from the codepage.
+A list of Font class instances, each corresponding to a FONT record.
+
-- New in version 0.6.1
+
A list of Format objects, each corresponding to a FORMAT record, in
+the order that they appear in the input file.
+It does not contain builtin formats.
+If you are creating an output file using (for example) pyExcelerator,
+use this list.
+The collection to be used for all visual rendering purposes is format_map.
+
-- New in version 0.6.1
+
The mapping from XF.format_key to Format object.
+
-- New in version 0.6.1
+
Time in seconds to extract the XLS image as a contiguous string (or mmap equivalent).
+Time in seconds to parse the data from the contiguous string (or mmap equivalent).
+A mapping from (lower_case_name, scope) to a single Name object.
+
-- New in version 0.6.0
+
A mapping from lower_case_name to a list of Name objects. The list is
+sorted in scope order. Typically there will be one item (of global scope)
+in the list.
+
-- New in version 0.6.0
+
List containing a Name object for each NAME record in the workbook.
+
-- New in version 0.6.0
+
The number of worksheets present in the workbook file. +This information is available even when no sheets have yet been loaded.
+If the user has changed any of the colours in the standard palette, the XLS
+file will contain a PALETTE record with 56 (16 for Excel 4.0 and earlier)
+RGB values in it, and this list will be e.g. [(r0, b0, g0), ..., (r55, b55, g55)].
+Otherwise this list will be empty. This is what you need if you are
+writing an output XLS file. If you want to render cells on screen or in a PDF
+file, use colour_map.
+
-- New in version 0.6.1. Extracted only if open_workbook(..., formatting_info=True)
+
This method has a dual purpose. You can call it to release +memory-consuming objects and (possibly) a memory-mapped file +(mmap.mmap object) when you have finished loading sheets in +on_demand mode, but still require the Book object to examine the +loaded sheets. It is also called automatically (a) when open_workbook +raises an exception and (b) if you are using a "with" statement, when +the "with" block is exited. Calling this method multiple times on the +same object has no ill effect.
+This provides access via name to the extended format information for
+both built-in styles and user-defined styles.
+It maps name to (built_in, xf_index), where:
+name is either the name of a user-defined style,
+or the name of one of the built-in styles. Known built-in names are
+Normal, RowLevel_1 to RowLevel_7,
+ColLevel_1 to ColLevel_7, Comma, Currency, Percent, "Comma [0]",
+"Currency [0]", Hyperlink, and "Followed Hyperlink".
+built_in 1 = built-in style, 0 = user-defined
+xf_index is an index into Book.xf_list.
+References: OOo docs s6.99 (STYLE record); Excel UI Format/Style
+
-- New in version 0.6.1; since 0.7.4, extracted only if
+open_workbook(..., formatting_info=True)
+
What (if anything) is recorded as the name of the last user to save the file.
+A list of XF class instances, each corresponding to an XF record.
+
-- New in version 0.6.1
+
Contains the data for one cell.
+ +WARNING: You don't call this class yourself. You access Cell objects +via methods of the Sheet object(s) that you found in the Book object that +was returned when you called xlrd.open_workbook("myfile.xls").
+Cell objects have three attributes: ctype is an int, value +(which depends on ctype) and xf_index. +If "formatting_info" is not enabled when the workbook is opened, xf_index will be None. +The following table describes the types of cells and how their values +are represented in Python.
+ +Type symbol | +Type number | +Python value | +
---|---|---|
XL_CELL_EMPTY | +0 | +empty string u'' | +
XL_CELL_TEXT | +1 | +a Unicode string | +
XL_CELL_NUMBER | +2 | +float | +
XL_CELL_DATE | +3 | +float | +
XL_CELL_BOOLEAN | +4 | +int; 1 means TRUE, 0 means FALSE | +
XL_CELL_ERROR | +5 | +int representing internal Excel codes; for a text representation, +refer to the supplied dictionary error_text_from_code | +
XL_CELL_BLANK | +6 | +empty string u''. Note: this type will appear only when +open_workbook(..., formatting_info=True) is used. | +
Width and default formatting information that applies to one or +more columns in a sheet. Derived from COLINFO records. + +
Here is the default hierarchy for width, according to the OOo docs:
+
+
"""In BIFF3, if a COLINFO record is missing for a column,
+the width specified in the record DEFCOLWIDTH is used instead.
+
+
In BIFF4-BIFF7, the width set in this [COLINFO] record is only used,
+if the corresponding bit for this column is cleared in the GCW
+record, otherwise the column width set in the DEFCOLWIDTH record
+is used (the STANDARDWIDTH record is always ignored in this case [see footnote!]).
+
+
In BIFF8, if a COLINFO record is missing for a column,
+the width specified in the record STANDARDWIDTH is used.
+If this [STANDARDWIDTH] record is also missing,
+the column width of the record DEFCOLWIDTH is used instead."""
+
+
+Footnote: The docs on the GCW record say this:
+"""
+If a bit is set, the corresponding column uses the width set in the STANDARDWIDTH
+record. If a bit is cleared, the corresponding column uses the width set in the
+COLINFO record for this column.
+
If a bit is set, and the worksheet does not contain the STANDARDWIDTH record, or if
+the bit is cleared, and the worksheet does not contain the COLINFO record, the DEFCOLWIDTH
+record of the worksheet will be used instead.
+
"""
+At the moment (2007-01-17) xlrd is going with the GCW version of the story.
+Reference to the source may be useful: see the computed_column_width(colx) method
+of the Sheet class.
+
-- New in version 0.6.1
+
Value of a 1-bit flag whose purpose is unknown +but is often seen set to 1
+1 = column is collapsed
+1 = column is hidden
+Outline level of the column, in range(7). +(0 = no outline)
+Width of the column in 1/256 of the width of the zero character, +using default font (first FONT record in the file).
+XF index to be used for formatting empty cells.
+This mixin class exists solely so that Format, Font, and XF.... objects +can be compared by value of their attributes.
+An Excel "font" contains the details of not only what is normally
+considered a font, but also several other display attributes.
+Items correspond to those in the Excel UI's Format/Cells/Font tab.
+
-- New in version 0.6.1
+
1 = Characters are bold. Redundant; see "weight" attribute.
+Values: 0 = ANSI Latin, 1 = System default, 2 = Symbol, +77 = Apple Roman, +128 = ANSI Japanese Shift-JIS, +129 = ANSI Korean (Hangul), +130 = ANSI Korean (Johab), +134 = ANSI Chinese Simplified GBK, +136 = ANSI Chinese Traditional BIG5, +161 = ANSI Greek, +162 = ANSI Turkish, +163 = ANSI Vietnamese, +177 = ANSI Hebrew, +178 = ANSI Arabic, +186 = ANSI Baltic, +204 = ANSI Cyrillic, +222 = ANSI Thai, +238 = ANSI Latin II (Central European), +255 = OEM Latin I
+An explanation of "colour index" is given in the Formatting +section at the start of this document.
+1 = Superscript, 2 = Subscript.
+0 = None (unknown or don't care)
+1 = Roman (variable width, serifed)
+2 = Swiss (variable width, sans-serifed)
+3 = Modern (fixed width, serifed or sans-serifed)
+4 = Script (cursive)
+5 = Decorative (specialised, for example Old English, Fraktur)
+
The 0-based index used to refer to this Font() instance. +Note that index 4 is never used; xlrd supplies a dummy place-holder.
+Height of the font (in twips). A twip = 1/20 of a point.
+1 = Characters are italic.
+The name of the font. Example: u"Arial"
+1 = Font is outline style (Macintosh only)
+1 = Font is shadow style (Macintosh only)
+1 = Characters are struck out.
+0 = None
+1 = Single; 0x21 (33) = Single accounting
+2 = Double; 0x22 (34) = Double accounting
+
1 = Characters are underlined. Redundant; see "underline_type" attribute.
+Font weight (100-1000). Standard values are 400 for normal text +and 700 for bold text.
+"Number format" information from a FORMAT record.
+
-- New in version 0.6.1
+
The key into Book.format_map
+The format string
+A classification that has been inferred from the format string.
+Currently, this is used only to distinguish between numbers and dates.
+
Values:
+
FUN = 0 # unknown
+
FDT = 1 # date
+
FNU = 2 # number
+
FGE = 3 # general
+
FTX = 4 # text
+
Contains the attributes of a hyperlink.
+Hyperlink objects are accessible through Sheet.hyperlink_list
+and Sheet.hyperlink_map.
+
-- New in version 0.7.2
+
Description ... this is displayed in the cell, +and should be identical to the cell value. Unicode string, or None. It seems +impossible NOT to have a description created by the Excel UI.
+Index of first column
+Index of first row
+Index of last column
+Index of last row
+The text of the "quick tip" displayed when the cursor +hovers over the hyperlink.
+Target frame. Unicode string. Note: I have not seen a case of this. +It seems impossible to create one in the Excel UI.
+"Textmark": the piece after the "#" in +"http://docs.python.org/library#struct_module", or the Sheet1!A1:Z99 +part when type is "workbook".
+Type of hyperlink. Unicode string, one of 'url', 'unc', +'local file', 'workbook', 'unknown'
+The URL or file-path, depending in the type. Unicode string, except +in the rare case of a local but non-existent file with non-ASCII +characters in the name, in which case only the "8.3" filename is available, +as a bytes (3.x) or str (2.x) string, with unknown encoding. +
Information relating to a named reference, formula, macro, etc.
+
-- New in version 0.6.0
+
-- Name information is not extracted from files older than
+Excel 5.0 (Book.biff_version < 50)
+
This is a convenience method for the use case where the name +refers to one rectangular area in one worksheet.
+0 = Formula definition; 1 = Binary data
No examples have been sighted.
+
0 = User-defined name; 1 = Built-in name +(common examples: Print_Area, Print_Titles; see OOo docs for full list)
+This is a convenience method for the frequent use case where the name +refers to a single cell.
+0 = Simple formula; 1 = Complex formula (array formula or user defined)
+No examples have been sighted.
+
0 = Command macro; 1 = Function macro. Relevant only if macro == 1
+Function group. Relevant only if macro == 1; see OOo docs for values.
+0 = Visible; 1 = Hidden
+0 = Standard name; 1 = Macro name
+A Unicode string. If builtin, decoded as per OOo docs.
+The index of this object in book.name_obj_list
+An 8-bit string.
+The result of evaluating the formula, if any. +If no formula, or evaluation of the formula encountered problems, +the result is None. Otherwise the result is a single instance of the +Operand class.
+-1: The name is global (visible in all calculation sheets).
+-2: The name belongs to a macro sheet or VBA sheet.
+-3: The name is invalid.
+0 <= scope < book.nsheets: The name is local to the sheet whose index is scope.
+
0 = Sheet macro; 1 = VisualBasic macro. Relevant only if macro == 1
+ Represents a user "comment" or "note".
+Note objects are accessible through Sheet.cell_note_map.
+
-- New in version 0.7.2
+
Author of note
+True if the containing column is hidden
+Column index
+List of (offset_in_string, font_index) tuples. +Unlike Sheet.rich_text_runlist_map, the first offset should always be 0. +
True if the containing row is hidden
+Row index
+True if note is always shown
+Text of the note
+Used in evaluating formulas. +The following table describes the kinds and how their values +are represented.
+ +Kind symbol | +Kind number | +Value representation | +
---|---|---|
oBOOL | +3 | +integer: 0 => False; 1 => True | +
oERR | +4 | +None, or an int error code (same as XL_CELL_ERROR in the Cell class). + | +
oMSNG | +5 | +Used by Excel as a placeholder for a missing (not supplied) function +argument. Should *not* appear as a final formula result. Value is None. | +
oNUM | +2 | +A float. Note that there is no way of distinguishing dates. | +
oREF | +-1 | +The value is either None or a non-empty list of
+absolute Ref3D instances. + |
+
oREL | +-2 | +The value is None or a non-empty list of +fully or partially relative Ref3D instances. + | +
oSTRG | +1 | +A Unicode string. | +
oUNK | +0 | +The kind is unknown or ambiguous. The value is None | +
oUNK means that the kind of operand is not known unambiguously.
+The reconstituted text of the original formula. Function names will be +in English irrespective of the original language, which doesn't seem +to be recorded anywhere. The separator is ",", not ";" or whatever else +might be more appropriate for the end-user's locale; patches welcome.
+None means that the actual value of the operand is a variable +(depends on cell data), not a constant.
+Represents an absolute or relative 3-dimensional reference to a box
+of one or more cells.
+-- New in version 0.6.0
+
The coords attribute is a tuple of the form:
+(shtxlo, shtxhi, rowxlo, rowxhi, colxlo, colxhi)
+where 0 <= thingxlo <= thingx < thingxhi.
+Note that it is quite possible to have thingx > nthings; for example
+Print_Titles could have colxhi == 256 and/or rowxhi == 65536
+irrespective of how many columns/rows are actually used in the worksheet.
+The caller will need to decide how to handle this situation.
+Keyword: IndexError :-)
+
The components of the coords attribute are also available as individual +attributes: shtxlo, shtxhi, rowxlo, rowxhi, colxlo, and colxhi.
+ +The relflags attribute is a 6-tuple of flags which indicate whether
+the corresponding (sheet|row|col)(lo|hi) is relative (1) or absolute (0).
+Note that there is necessarily no information available as to what cell(s)
+the reference could possibly be relative to. The caller must decide what if
+any use to make of oREL operands. Note also that a partially relative
+reference may well be a typo.
+For example, define name A1Z10 as $a$1:$z10 (missing $ after z)
+while the cursor is on cell Sheet3!A27.
+The resulting Ref3D instance will have coords = (2, 3, 0, -16, 0, 26)
+and relflags = (0, 0, 0, 1, 0, 0).
+So far, only one possibility of a sheet-relative component in
+a reference has been noticed: a 2D reference located in the "current sheet".
+
This will appear as coords = (0, 1, ...) and relflags = (1, 1, ...).
+
Height and default formatting information that applies to a row in a sheet.
+Derived from ROW records.
+
-- New in version 0.6.1
height: Height of the row, in twips. One twip == 1/20 of a point.
+ +has_default_height: 0 = Row has custom height; 1 = Row has default height.
+ +outline_level: Outline level of the row (0 to 7)
+ +outline_group_starts_ends: 1 = Outline group starts or ends here (depending on where the +outline buttons are located, see WSBOOL record [TODO ??]), +and is collapsed
+ +hidden: 1 = Row is hidden (manually, or by a filter or outline group)
+ +height_mismatch: 1 = Row height and default font height do not match
+ +has_default_xf_index: 1 = the xf_index attribute is usable; 0 = ignore it
+ +xf_index: Index to default XF record for empty cells in this row. +Don't use this if has_default_xf_index == 0.
+ +additional_space_above: This flag is set, if the upper border of at least one cell in this row +or if the lower border of at least one cell in the row above is +formatted with a thick line style. Thin and medium line styles are not +taken into account.
+ +additional_space_below: This flag is set, if the lower border of at least one cell in this row +or if the upper border of at least one cell in the row below is +formatted with a medium or thick line style. Thin line styles are not +taken into account.
+Contains the data for one worksheet.
+ +In the cell access functions, "rowx" is a row index, counting from zero, and "colx" is a +column index, counting from zero. +Negative values for row/column indexes and slice positions are supported in the expected fashion.
+ +For information about cell types and cell values, refer to the documentation of the Cell class.
+ +WARNING: You don't call this class yourself. You access Sheet objects via the Book object that +was returned when you called xlrd.open_workbook("myfile.xls").
+A reference to the Book object to which this sheet belongs. +Example usage: some_sheet.book.datemode
+Cell object in the given row and column. +
A sparse mapping from (rowx, colx) to a Note object.
+Cells not containing a note ("comment") are not mapped.
+
-- New in version 0.7.2
Type of the cell in the given row and column. +Refer to the documentation of the Cell class. +
Value of the cell in the given row and column.
+XF index of the cell in the given row and column.
+This is an index into Book.xf_list.
+
-- New in version 0.6.1
+
Returns a sequence of the Cell objects in the given column. +
List of address ranges of cells containing column labels.
+These are set up in Excel by Insert > Name > Labels > Columns.
+
-- New in version 0.6.0
+
How to deconstruct the list:
+
+for crange in thesheet.col_label_ranges: + rlo, rhi, clo, chi = crange + for rx in xrange(rlo, rhi): + for cx in xrange(clo, chi): + print "Column label at (rowx=%d, colx=%d) is %r" \ + (rx, cx, thesheet.cell_value(rx, cx)) ++
Returns a slice of the Cell objects in the given column. +
Returns a slice of the types of the cells in the given column.
+Returns a slice of the values of the cells in the given column.
+The map from a column index to a Colinfo object. Often there is an entry
+in COLINFO records for all column indexes in range(257).
+Note that xlrd ignores the entry for the non-existent
+257th column. On the other hand, there may be no entry for unused columns.
+
-- New in version 0.6.1. Populated only if open_workbook(formatting_info=True).
+
Determine column display width.
+
-- New in version 0.6.1
+
+
Default value to be used for a row if there is +no ROW record for that row. +From the optional DEFAULTROWHEIGHT record. +
Default value to be used for a row if there is +no ROW record for that row. +From the optional DEFAULTROWHEIGHT record. +
Default value to be used for a row if there is +no ROW record for that row. +From the optional DEFAULTROWHEIGHT record. +
Default value to be used for a row if there is +no ROW record for that row. +From the optional DEFAULTROWHEIGHT record. +
Default value to be used for a row if there is +no ROW record for that row. +From the optional DEFAULTROWHEIGHT record. +
Default column width from DEFCOLWIDTH record, else None.
+From the OOo docs:
+"""Column width in characters, using the width of the zero character
+from default font (first FONT record in the file). Excel adds some
+extra space to the default width, depending on the default font and
+default font size. The algorithm how to exactly calculate the resulting
+column width is not known.
+Example: The default width of 8 set in this record results in a column
+width of 8.43 using Arial font with a size of 10 points."""
+For the default hierarchy, refer to the Colinfo class.
+
-- New in version 0.6.1
+
A 256-element tuple corresponding to the contents of the GCW record for this sheet. +If no such record, treat as all bits zero. +Applies to BIFF4-7 only. See docs of the Colinfo class for discussion. +
Boolean specifying if a PANE record was present, ignore unless you're xlutils.copy
+A list of the horizontal page breaks in this sheet.
+Breaks are tuples in the form (index of row after break, start col index, end col index).
+Populated only if open_workbook(formatting_info=True).
+
-- New in version 0.7.2
+
Index of first visible row in bottom frozen/split pane
+Number of rows in top pane (frozen panes; for split panes, see comments below in code)
+A list of Hyperlink objects corresponding to HLINK records found
+in the worksheet.
-- New in version 0.7.2
A sparse mapping from (rowx, colx) to an item in hyperlink_list.
+Cells not covered by a hyperlink are not mapped.
+It is possible using the Excel UI to set up a hyperlink that
+covers a larger-than-1x1 rectangle of cells.
+Hyperlink rectangles may overlap (Excel doesn't check).
+When a multiply-covered cell is clicked on, the hyperlink that is activated
+(and the one that is mapped here) is the last in hyperlink_list.
+
-- New in version 0.7.2
List of address ranges of cells which have been merged.
+These are set up in Excel by Format > Cells > Alignment, then ticking
+the "Merge cells" box.
+
-- New in version 0.6.1. Extracted only if open_workbook(formatting_info=True).
+
How to deconstruct the list:
+
+for crange in thesheet.merged_cells: + rlo, rhi, clo, chi = crange + for rowx in xrange(rlo, rhi): + for colx in xrange(clo, chi): + # cell (rlo, clo) (the top left one) will carry the data + # and formatting info; the remainder will be recorded as + # blank cells, but a renderer will apply the formatting info + # for the top left cell (e.g. border, pattern) to all cells in + # the range. ++
Name of sheet.
+Nominal number of columns in sheet. It is 1 + the maximum column index +found, ignoring trailing empty cells. See also open_workbook(ragged_rows=?) +and Sheet.row_len(row_index). +
Number of rows in sheet. A row index is in range(thesheet.nrows).
+Mapping of (rowx, colx) to list of (offset, font_index) tuples. The offset
+defines where in the string the font begins to be used.
+Offsets are expected to be in ascending order.
+If the first offset is not zero, the meaning is that the cell's XF's font should
+be used from offset 0.
+
This is a sparse mapping. There is no entry for cells that are not formatted with
+rich text.
+
How to use:
+
+runlist = thesheet.rich_text_runlist_map.get((rowx, colx)) +if runlist: + for offset, font_index in runlist: + # do work here. + pass ++Populated only if open_workbook(formatting_info=True). +
Returns a sequence of the Cell objects in the given row. +
List of address ranges of cells containing row labels.
+For more details, see col_label_ranges above.
+
-- New in version 0.6.0
+
Returns the effective number of cells in the given row. For use with
+open_workbook(ragged_rows=True) which is likely to produce rows
+with fewer than ncols cells.
+
-- New in version 0.7.2
+
Returns a slice of the Cell objects in the given row. +
Returns a slice of the types +of the cells in the given row.
+Returns a slice of the values +of the cells in the given row.
+The map from a row index to a Rowinfo object. Note that it is possible
+to have missing entries -- at least one source of XLS files doesn't
+bother writing ROW records.
+
-- New in version 0.6.1. Populated only if open_workbook(formatting_info=True).
+
Frozen panes: ignore it. Split panes: explanation and diagrams in OOo docs.
+Default column width from STANDARDWIDTH record, else None.
+From the OOo docs:
+"""Default width of the columns in 1/256 of the width of the zero
+character, using default font (first FONT record in the file)."""
+For the default hierarchy, refer to the Colinfo class.
+
-- New in version 0.6.1
+
Index of first visible column in right frozen/split pane
+Number of columns in left pane (frozen panes; for split panes, see comments below in code)
+A list of the vertical page breaks in this sheet.
+Breaks are tuples in the form (index of col after break, start row index, end row index).
+Populated only if open_workbook(formatting_info=True).
+
-- New in version 0.7.2
+
Visibility of the sheet. 0 = visible, 1 = hidden (can be unhidden +by user -- Format/Sheet/Unhide), 2 = "very hidden" (can be unhidden +only by VBA macro).
+eXtended Formatting information for cells, rows, columns and styles.
+
-- New in version 0.6.1
+
+
Each of the 6 flags below describes the validity of
+a specific group of attributes.
+
+In cell XFs, flag==0 means the attributes of the parent style XF are used,
+(but only if the attributes are valid there); flag==1 means the attributes
+of this XF are used.
+In style XFs, flag==0 means the attribute setting is valid; flag==1 means
+the attribute should be ignored.
+Note that the API
+provides both "raw" XFs and "computed" XFs -- in the latter case, cell XFs
+have had the above inheritance mechanism applied.
+
+
An instance of an XFAlignment object.
+An instance of an XFBackground object.
+An instance of an XFBorder object.
+Index into Book.font_list
+Key into Book.format_map +
+Warning: OOo docs on the XF record call this "Index to FORMAT record". +It is not an index in the Python sense. It is a key to a map. +It is true only for Excel 4.0 and earlier files +that the key into format_map from an XF instance +is the same as the index into format_list, and only +if the index is less than 164. +
+0 = cell XF, 1 = style XF
+cell XF: Index into Book.xf_list
+of this XF's style XF
+style XF: 0xFFF
+
An instance of an XFProtection object.
+Index into Book.xf_list
+A collection of the alignment and similar attributes of an XF record.
+Items correspond to those in the Excel UI's Format/Cells/Alignment tab.
+
-- New in version 0.6.1
+
Values: section 6.115 (p 214) of OOo docs
+A number in range(15).
+Values: section 6.115 (p 215) of OOo docs.
+Note: file versions BIFF7 and earlier use the documented
+"orientation" attribute; this will be mapped (without loss)
+into "rotation".
+
1 = shrink font size to fit text into cell.
+0 = according to context; 1 = left-to-right; 2 = right-to-left
+1 = text is wrapped at right margin
+Values: section 6.115 (p 215) of OOo docs
+A collection of the background-related attributes of an XF record.
+Items correspond to those in the Excel UI's Format/Cells/Patterns tab.
+An explanation of "colour index" is given in the Formatting
+section at the start of this document.
+
-- New in version 0.6.1
+
See section 3.11 of the OOo docs.
+See section 3.11 of the OOo docs.
+See section 3.11 of the OOo docs.
+A collection of the border-related attributes of an XF record. +Items correspond to those in the Excel UI's Format/Cells/Border tab.
+An explanations of "colour index" is given in the Formatting +section at the start of this document. +There are five line style attributes; possible values and the +associated meanings are: +0 = No line, +1 = Thin, +2 = Medium, +3 = Dashed, +4 = Dotted, +5 = Thick, +6 = Double, +7 = Hair, +8 = Medium dashed, +9 = Thin dash-dotted, +10 = Medium dash-dotted, +11 = Thin dash-dot-dotted, +12 = Medium dash-dot-dotted, +13 = Slanted medium dash-dotted. +The line styles 8 to 13 appear in BIFF8 files (Excel 97 and later) only. +For pictures of the line styles, refer to OOo docs s3.10 (p22) +"Line Styles for Cell Borders (BIFF3-BIFF8)".
+The colour index for the cell's bottom line
+The line style for the cell's bottom line
+The colour index for the cell's diagonal lines, if any
+1 = draw a diagonal from top left to bottom right
+The line style for the cell's diagonal lines, if any
+1 = draw a diagonal from bottom left to top right
+The colour index for the cell's left line
+The line style for the cell's left line
+The colour index for the cell's right line
+The line style for the cell's right line
+The colour index for the cell's top line
+The line style for the cell's top line
+A collection of the protection-related attributes of an XF record.
+Items correspond to those in the Excel UI's Format/Cells/Protection tab.
+Note the OOo docs include the "cell or style" bit
+in this bundle of attributes.
+This is incorrect; the bit is used in determining which bundles to use.
+
-- New in version 0.6.1
+
1 = Cell is prevented from being changed, moved, resized, or deleted +(only if the sheet is protected).
+1 = Hide formula so that it doesn't appear in the formula bar when +the cell is selected (only if the sheet is protected).
+