Welcome to table-compositor’s documentation!¶
Contents:
Introduction¶
The table-compositor library provides the API to render data stored in table-like data structures. Currently the library only supports rendering data available in a Panda’s DataFrames. The DataFrame layout is used as the table layout(including single and multi hierarchical columns/indices) by the library. The table layout is rendered directly on to an xslx sheet or to a html page. Styling and layout attributes can be used to render colorful xlsx or html reports. The library also supports rendering of multiple data frames into a single xlsx sheet or html page (with horizontal/vertical layouts). The objective of the library is to be able to use the DataFrame as the API to configure the style and layout properties of the report. Callback functions are provided to customize all styling properties. The nice thing about the callback functions are that the style properties are set on cells indexed with index/column values available in the original dataframe used during rendering.
Code: https://github.com/InvestmentSystems/table-compositor
Docs: http://table-compositor.readthedocs.io
Packages: https://pypi.python.org/pypi/table-compositor
Getting Started¶
The table-compositor library builds on the concept of Panda’s DataFrame as API to render colorful reports. The various ways of providing styling attributes and choosing layouts are demonstrated with numerous examples in the documentation. Please refer to the Bacis section of the documentation to get started with the HelloWorld example.
Installation¶
A standard setuptools installer is available via PyPI:
https://pypi.python.org/pypi/table-compositor
Or, install via pip3:
pip3 install table-compositor
Source code can be obtained here:
Basics¶
The purpose of this library is to use the Pandas DataFrame as an interface to represent the layout of a table that needs to be rendered to an xlsx file or as an html table. The library abstracts away the tedious work of working at the cell level of an xlsx sheet or a html table. It provides a call-back mechanism by which the user is able to provide values that need to be rendered and also the styling that needs to be used for each cell in the rendered table. The library is also capable of laying out multiple tables in the same sheet which are evenly spaced vertically or horizontally based on the layout configuration provided.
Sample Data¶
During the later part of this documentation, we will use the sample data from the Social Security Administration which contains the U.S. child birth name records. We choose this sample data for two reasons. We reuse some of the discussion that are outlined by Wes McKinnery’s Python For Data Analysis, 2nd Edition(2017). The same data is also used in the documentation of another library function-pipe <http://function-pipe.readthedocs.io/en/latest/index.html> that the Investment Systems Group has open-sourced.
https://www.ssa.gov/oact/babynames/names.zip
Further more, we will assume that a flattened file from all the smaller files in the .zip file is available after we invoke the following function.
Please refer to the XLSX Examples section for code that loads this data.
A Hello World Example: Dataframe to Xlsx¶
Every use of this library involves four steps.
We build a dataframe that resembles the shape of the table that will be rendered.
The dataframe is passed as an argument to the function called
build_presentation_model
. This function accepts a dataframe and also a number of functions as arguments. We call the value returned by this function, the presentation_model.Create a layout of multiple presentation models (if we want more than one table rendered in same xlsx sheet or same html page)
Call the render_xlsx or render_html functions on the respective writers. For xlsx files either OpenPyxlCompositor(uses openpyxl library) or XlsxWriterCompositor(uses xlsxwriter library). For HTML use the HTMLWriter.
A Quick Look at a Xlsx example¶
We will start with a simple dataframe and render the dataframe as-is to a xlsx file
import pandas as pd
from table_compositor.table_compositor import build_presentation_model
from table_compositior.xlsx_writer import OpenPyxlCompositor
# Note: use XlsxWriterCompositor to use xlsxwriter library
sample_df = pd.DataFrame(dict(a=[10, 20, 30, 40, 50], b=[0.1, 0.9,0.2, 0.6,0.3]), index=[1,2,3,4,5])
# create a presentation model
# defaults to engine='openpyxl'. Needs to be set to 'xlsxwriter' to use `xlsxwriter` library instead.
presentation_model = build_presentation_model(df=sample_df)
# create a layout, which is usually a nested list of presentation models
layout = [presentation_model]
# render to xlsx
output_fp = '/tmp/example1.xlsx'
OpenPyxlCompositor.to_xlsx(layout, output_fp=output_fp)
Running this code produces the following output:

In the above code snippet, we first created a dataframe called sample_df
.
To render this dataframe, we first invoke build_presentation_model. The build_presentation_model accepts the dataframe as its first argument. In this example, we use the defaults provided by this method for all other arguments. The build_presentation_model returns an presentation_model object.
Before we call OpenPyxlCompositor.to_xlsx we create a layout. A layout is a nested list of presentation_models. In our case, since we have only one presentation_model we create a list with a single element. Later on when we work with multiple presentation models that need to be rendered on to the same sheet, we could create nested list such as [[model1, model2], [model3]] etc.
Building the Presentation Model¶
The build_presentation_model function is the most important interface in this library. This function exposes all the functionality that is required to render beautiful looking excel worksheets or html tables.
We will now build up on our previous example and add styling to the report we generate. Before, we do that lets take a quick look at the signature of build_presentation__model.
- table_compositor.table_compositor.build_presentation_model(*, df, output_format='xlsx', data_value_func=None, column_style_func=None, data_style_func=None, header_style_func=None, header_value_func=None, index_style_func=None, index_value_func=None, index_name_func=None, index_name_style_func=None, engine='openpyxl', **kwargs)[source]¶
Construct and return the presentation model that will be used while rendering to html/xlsx formats. The returned object has all the information required to render the tables in the requested format. The details of the object is transparent to the caller. It is only exposed for certain advanced operations.
- Parameters
df – The dataframe representation of the table. The shape of the dataframe closely resembles the table that will be rendered in the requested format.
output_format – ‘html’ or ‘xlsx’
data_value_func – example: lambda idx, col: df.loc[idx, col], assuming df is in the closure. This can be None, if no data transformation is required to the values already present in the source df
column_style_func – the function can substitute the data_style_func, if the same style can be applied for the whole column. This argument should be prefered over the data_style_func argument. Using this option provides better performance since the fewer objects will be created internally and fewer callbacks are made to this function when compared to data_style_func.This argument only applies to the data contained in the dataframe and not the cell where the headers are rendered. For fine grained control at cell level, the data_style_func argument can be used. For more information on return values of this function, refer to the documentation for data_style_func argument.
data_style_func – used to provide style at the cell level. Example: lambda idx, col: return dict(font=Font(…)), where Font is the openpyxl object and font is the attr available in the cell instance of openpyxl. For xlsx, the keys in the dict are the attrs of the cell object in openpyxl and the values correspond to the value of that attribute. Example are found in xlsx_styles module. For html, the key-value pairs are any values that go into to the style attribute of a td, th cell in html. Examples are found in html_styles module. example: dict(background-color=’#F8F8F8’). When performance becomes an issue, and cell level control is not needed, it is recommended to use the column_style_func argument rathat than this argument. If the prefered engine is XlswWriter, then the style dictionary returned should have key/values compatible with the Format object declarted in the XlsxWriter library. A reference can be found in ``xlsx_styles.XlsxWriterStyleHelper` class
header_value_func – func that takes a object of type IndexNode. The IndexNode contains the attributes that refer to the header being rendered. The returned value from this function is displayed in place of the header in the dataframe at the location. The two properties available on the IndexNode object are value and key. The key is useful to identify the exact index and level in context while working with multi-hierarchical columns.
header_style_func – func that takes a object of type IndexNode. The return value of this function is similar to data_style_func.
index_value_func – func that takes a object of type IndexNode. The IndexNode contains the attributes that refer to the index being rendered. The returned value from this function is displayed in place of the index in the dataframe at the location.
index_style_func – func that takes a object of type IndexNode. The return value of this function is similar to data_style_func.
index_name_func – func that returns a string for index name (value to be displayed on top-left corner, above the index column)
index_name_style – the style value same as data_style_func that will be used to style the cell
engine – required while building presentation model for xlsx. Argument ignored for HTML rendering. This argument is used to provide the default callback style functions, where the style dictionary returned by the callback functions should be compatible with the engine being used.
kwargs –
‘hide_index’ - if True, then hide the index column, default=False
’hide_header, - if True, then hide the header, default=False
’use_convert’ - if True, do some conversions from dataframe values to values excel can understand for example np.NaN are converted to NaN strings
- Returns
A presentation model, to be used to create layout and provide the layout to the html or xlsx writers.
About the callback functions provided as arguments:
Note that callback function provided as arguments to this function are provided with either a tuple of index, col arguments are some information regarding the index or headers being rendered. Therefore, a common pattern would be to capture the dataframe being rendered in a closure of this callback func before passing them as arugments.
For example:
df = pd.DataFrame(dict(a=[1, 2, 3]))
- def data_value_func():
- def _inner(idx, col):
return df.loc[idx, col] * 10.3
return _inner
pm = build_presentation_model(df=df, data_value_func=data_value_func())
Improving on our first iteration¶
Now, that we got a overview of the build_presentation_mode function, lets try setting these arguments to improve the look of our reports.
Say, we have the following requirements:
Display column ‘A’ as in dollar format.
Display column ‘B’ as percentage values.’
Set back-ground color of column ‘B’ to red if value is less than 50%
Capitalize all the column headers and add a yellow background
Multiply all index values by 100 while rendering and add a color to the background.
Display a ‘custom text’ on the top left corner, where pandas whole usually display the index name if available.
We update our previous example to do the following:
1import os
2import tempfile
3
4import pandas as pd
5
6from table_compositor.table_compositor import build_presentation_model
7from table_compositor.xlsx_styles import OpenPyxlStyleHelper
8
9# There are equivalent classes for using xlsxwriter library. Namely,
10# XlsxWriterCompositor and XlsxWriterStyleHelper
11from table_compositor.xlsx_writer import OpenPyxlCompositor
12
1def basic_example2():
2
3 df = pd.DataFrame(
4 dict(a=[10, 20, 30, 40, 50], b=[0.1, 0.9, 0.2, 0.6, 0.3]), index=[1, 2, 3, 4, 5]
5 )
6
7 def style_func(idx, col):
8 if col == "b":
9 return OpenPyxlStyleHelper.get_style(number_format="0.00%")
10 else:
11 # for 'a' we do dollar format
12 return OpenPyxlStyleHelper.get_style(number_format="$#,##.00")
13
14 # create a presentation model
15 # note the OpenPyxlStyleHelper function available in xlsx_styles module. But a return value of style function
16 # can be any dict whose keys are attributes of the OpenPyxl cell object.
17 presentation_model = build_presentation_model(
18 df=df,
19 data_value_func=lambda idx, col: df.loc[idx, col] * 10
20 if col == "a"
21 else df.loc[idx, col],
22 data_style_func=style_func,
23 header_value_func=lambda node: node.value.capitalize(),
24 header_style_func=lambda _: OpenPyxlStyleHelper.default_header_style(),
25 index_value_func=lambda node: node.value * 100,
26 index_style_func=lambda _: OpenPyxlStyleHelper.default_header_style(),
27 index_name_func=lambda _: "Basic Example",
28 index_name_style_func=lambda _: OpenPyxlStyleHelper.default_header_style(),
29 )
30
31 # create a layout, which is usually a nested list of presentation models
32 layout = [presentation_model]
33
34 # render to xlsx
35 output_fp = os.path.join(tempfile.gettempdir(), "basic_example2.xlsx")
36 OpenPyxlCompositor.to_xlsx(layout=layout, output_fp=output_fp)
37
38
On line 3 we create the dataframe.
To satisfy the requirements we listed above we pass the callback function to the build_presentation_model. Note that some helper functions are available in xlsx_style function to create styles for openpyxl. But, any other dict with keys that are attr of cell object of openpyxl should work. The above example produces the output as shown below:

Multi-hierarchical columns and indices¶
Rendering dataframes with multi-hierarchical columns or indices are very similar to rendering the simpler dataframes. The data_value_func and data_style_func work the same way. The functions that handle index cell rendering and column header rendering can access the IndexNode object that is passed to those functions to determine the value and level that is currently being rendered. This becomes clearer with an example.
We demonstrate this by setting a variety of colors to each cell that holds one of the values of the hierarchical columns or indices.
Note that the IndexNode argument passed to the callback function has a node.key field that unique identifies each cell with a name that is built appending the value of each item in the index or column hierarchy.
1import os
2import tempfile
3
4import pandas as pd
5
6from table_compositor.table_compositor import build_presentation_model
7from table_compositor.xlsx_styles import OpenPyxlStyleHelper
8
9# There are equivalent classes for using xlsxwriter library. Namely,
10# XlsxWriterCompositor and XlsxWriterStyleHelper
11from table_compositor.xlsx_writer import OpenPyxlCompositor
12
1def basic_example3():
2
3 df = pd.DataFrame(
4 dict(
5 a=[10, 20, 30, 40],
6 b=[0.1, 0.9, 0.2, 0.6],
7 d=[50, 60, 70, 80],
8 e=[200, 300, 400, 500],
9 )
10 )
11 df.columns = pd.MultiIndex.from_tuples(
12 [("A", "x"), ("A", "y"), ("B", "x"), ("B", "y")]
13 )
14 df.index = pd.MultiIndex.from_tuples([(1, 100), (1, 200), (2, 100), (2, 200)])
15 print(df)
16
17 def index_style_func(node):
18 # node.key here could be one of (1,), (1, 100), (2,), (2, 100), (2, 200)
19 bg_color = "FFFFFF"
20 if node.key == (1,) or node.key == (2,):
21 bg_color = "9E80B8"
22 elif node.key[1] == 100:
23 bg_color = "4F90C1"
24 elif node.key[1] == 200:
25 bg_color = "6DC066"
26 return OpenPyxlStyleHelper.get_style(bg_color=bg_color)
27
28 def header_style_func(node):
29 bg_color = "FFFFFF"
30 if node.key == ("A",) or node.key == ("B",):
31 bg_color = "9E80B8"
32 elif node.key[1] == "x":
33 bg_color = "4F90C1"
34 elif node.key[1] == "y":
35 bg_color = "6DC066"
36 return OpenPyxlStyleHelper.get_style(bg_color=bg_color)
37
38 # create a presentation model
39 # note the OpenPyxlStyleHeloer function available in xlsx_styles module. But a return value of style function
40 # can be any dict whose keys are attributes of the OpenPyxl cell object.
41 presentation_model = build_presentation_model(
42 df=df,
43 index_style_func=index_style_func,
44 header_style_func=header_style_func,
45 index_name_func=lambda _: "Multi-Hierarchy Example",
46 )
47
48 # create a layout, which is usually a nested list of presentation models
49 layout = [presentation_model]
50
51 # render to xlsx
52 output_fp = os.path.join(tempfile.gettempdir(), "basic_example3.xlsx")
53 OpenPyxlCompositor.to_xlsx(layout=layout, output_fp=output_fp)
54
55
The above function gives us the xlsx file shown below. Note the colors used to render the indices and columns and review how the two functions, namely, index_style_function and header_style_function provide the colors based on the IndexNode attributes. You will notice the use of node.key in these functions to identify each cell uniquely.

Layouts¶
Apart from providing styling and formatting facilities, the library also provides a powerful way to layout multiple tables on one sheet. In this section we will look at some examples.
We will use the same presentation model from basic_example2(). We will layout the presentation models with different layouts.
1import os
2import tempfile
3
4import pandas as pd
5
6from table_compositor.table_compositor import build_presentation_model
7from table_compositor.xlsx_styles import OpenPyxlStyleHelper
8
9# There are equivalent classes for using xlsxwriter library. Namely,
10# XlsxWriterCompositor and XlsxWriterStyleHelper
11from table_compositor.xlsx_writer import OpenPyxlCompositor
12
1def layout_example1():
2
3 df = pd.DataFrame(
4 dict(a=[10, 20, 30, 40, 50], b=[0.1, 0.9, 0.2, 0.6, 0.3]), index=[1, 2, 3, 4, 5]
5 )
6
7 def style_func(idx, col):
8 if col == "b":
9 return OpenPyxlStyleHelper.get_style(number_format="0.00%")
10 else:
11 # for 'a' we do dollar format
12 return OpenPyxlStyleHelper.get_style(number_format="$#,##.00")
13
14 # create a presentation model
15 # note the OpenPyxlStyleHeloer function available in xlsx_styles module. But a return value of style function
16 # can be any dict whose keys are attributes of the OpenPyxl cell object.
17 presentation_model = build_presentation_model(
18 df=df,
19 data_value_func=lambda idx, col: df.loc[idx, col] * 10
20 if col == "a"
21 else df.loc[idx, col],
22 data_style_func=style_func,
23 header_value_func=lambda node: node.value.capitalize(),
24 header_style_func=lambda _: OpenPyxlStyleHelper.default_header_style(),
25 index_value_func=lambda node: node.value * 100,
26 index_style_func=lambda _: OpenPyxlStyleHelper.default_header_style(),
27 index_name_func=lambda _: "Basic Example",
28 index_name_style_func=lambda _: OpenPyxlStyleHelper.default_header_style(),
29 )
30
31 # start_layout_code_1
32 # create a layout, which is usually a nested list of presentation models
33 layout = [[presentation_model], [[presentation_model], [presentation_model]]]
34
35 # render to xlsx
36 output_fp = os.path.join(tempfile.gettempdir(), "layout_vertical_example1.xlsx")
37 # the default value for orientation is 'vertical'
38 OpenPyxlCompositor.to_xlsx(
39 layout=layout, output_fp=output_fp, orientation="vertical"
40 )
41
42 output_fp = os.path.join(tempfile.gettempdir(), "layout_horizontal_example1.xlsx")
43 OpenPyxlCompositor.to_xlsx(
44 layout=layout, output_fp=output_fp, orientation="horizontal"
45 )
46 print("Writing xlsx file=", output_fp)
47
48 # mutiple nesting
49 layout_complex = [
50 presentation_model,
51 [presentation_model, [presentation_model, presentation_model]],
52 ]
53
54 output_fp = os.path.join(tempfile.gettempdir(), "layout_complex_example1.xlsx")
55 OpenPyxlCompositor.to_xlsx(
56 layout=layout_complex, output_fp=output_fp, orientation="vertical"
57 )
58 print("Writing xlsx file=", output_fp)
59 # end_layout_code_1
60
61
In the preceding example, we create two layouts. On line 28, we have a layout defined and then rendered to two files with different orientations.
When the orientation is vertical, then each item (presentation_model) in the list is layed out vertically. The orientation flips between vertical and horizontal for every nested listed that is encountered. In this example, you will notice that since the second item in the outer list is a list, the two presentation models in the inner list are rendered side-by-side (i.e. with horizontal orientation)

When the value of orientation argument is changed to horizontal, the renderer renders the outerlist horizontally and flips the orientation of inner lists to vertical. The second output is show below.

Example of XLSX Styles¶
In the preceding examples, we have used the functions provided by xslx_styles.OpenPyxlStyleHelper to return the required style dictionary. Some examples of style dictionaries that can be returned by functions returning styles are provided below for reference. For details on how to build style attributes refer to the openpyxl documentation.
Style with background color¶
from openpyxl.styles import PatternFill
from openpyxl.styles import fills
fill = PatternFill(fgColor=Color('4f81BD'), patternType=fills.FILL_SOLID)
# note that we return a dict, whose key = `fill` which is an
# attrbute of `cell` object in `openpyxl`
style = dict(fill=fill)
Style with percentage formatting¶
number_format = '0.00%'
fill = PatternFill(fgColor=Color('4f81BD'), patternType=fills.FILL_SOLID)
# note that we return a dict, whose key = `number_format` which is an
# attrbute of `cell` object in `openpyxl`
style = dict(number_format=number_format)
Style with alignment and fonts¶
from openpyxl.styles import Alignment
from openpyxl.styles import Font
font=Font(bold=True, color='FFFFFF')
# note that we return a dict, whose key = `number_format` which is an
# attrbute of `cell` object in `openpyxl`
style = dict(alignment=Alignment(horizontal=center, font=font)
Using a different XLXS Writer Engine¶
Note that if xlsxwriter library is uses, the keys in the dictionary returned by the callback funcs should match the keys required to build the Format object declared in the xlsxwriter library. Some examples of these keys can be found in xlsx_styles.XlsxWriterStyleHelper class.
Example of HTML Styles¶
In the Basic section, we only saw examples of how to render dataframes to a xlsx format. The same setup can be used to render dataframes to HTML using the html_writer.HTMLWriter.to_html function. The only thing that has to be changed is the style attributes that are being returned. We want our style providing functions to return style attributes that can be inlined into the style attribute of a <td> or <th> tag.
Some examples of style dictionaries that can be return by functions returning styles are provided below for reference.
Style for headers¶
style = dict(
text_align='center',
background_color='#4F81BD',
color='#FFFFFF',
font_weight='bold',
white_space='pre',
padding='10px',
border=1)
Style for cell holding numeric values¶
numeric_style = dict(
text_align='right',
background_color='#FFFFFF',
color='#000000',
font_weight='normal',
white_space='pre',
padding='10px',
border=None)
Style using the html_styles.td_style object¶
style = td_style(
text_align='center',
background_color='#4F81BD',
color='#FFFFFF',
font_weight='bold',
white_space='pre',
padding='10px',
border=1)
XLSX Examples¶
This page provides a list of examples that demonstrate rendering xlsx output from the given dataframe. Each example is self-contained inside a class. We have some helper functions to provide the data we need for this examples
All examples listed in this section use OpenPyxlStyleHelper and OpenPyxlCompositor. To use the xlsxwriter library replace these with XlsxWriterStyleHelper and XlsxWriterCompositor respectively.
If the callback funcs do not use the *StyleHelpers and return their own Style objects then the objects returned should be compatible with the value engine provided to the engine attribute.
Examples of relevant style objects can be found respective documentations for the engine being used.
Helper Functions (Data loading routines)¶
1import tempfile
2import webbrowser
3import zipfile
4
5import pandas as pd
6import requests
7
8import table_compositor.table_compositor as tc
9import table_compositor.xlsx_styles as xlsstyle
10import table_compositor.xlsx_writer as xlsxw
11
1# code snippet adapted from http://function-pipe.readthedocs.io/en/latest/usage_df.html
2# source url
3URL_NAMES = "https://www.ssa.gov/oact/babynames/names.zip"
4ZIP_NAME = "names.zip"
5
6
7def load_names_data():
8 fp = os.path.join(tempfile.gettempdir(), ZIP_NAME)
9 if not os.path.exists(fp):
10 r = requests.get(URL_NAMES)
11 with open(fp, "wb") as f:
12 f.write(r.content)
13
14 post = collections.OrderedDict()
15 with zipfile.ZipFile(fp) as zf:
16 # get ZipInfo instances
17 for zi in sorted(zf.infolist(), key=lambda zi: zi.filename):
18 fn = zi.filename
19 if fn.startswith("yob"):
20 year = int(fn[3:7])
21 df = pd.read_csv(
22 zf.open(zi), header=None, names=("name", "gender", "count")
23 )
24 df["year"] = year
25 post[year] = df
26
27 df = pd.concat(post.values())
28 df.set_index("name", inplace=True, drop=True)
29 return df
30
31
32def sample_names_data():
33 df = load_names_data()
34 df = df[(df["year"] == 2015) & (df["count"] > 1000)]
35 return df.sample(100, random_state=0).sort_values("count")
36
37
38def top_names_for_year(year=2015, gender="F", top_n=5):
39 df = load_names_data()
40 df = df[(df["year"] == year) & (df["gender"] == gender)]
41 df = df.sort_values("count")[:top_n]
42 return df
43
44
Example 1 - DataFrame with default styles¶
Demonstrates converting dataframe into html format with default styles.
1class XLSXExample1:
2 """
3 Demonstrates rendering a simple dataframe to a xlsx file
4 using the default styles
5 """
6
7 @classmethod
8 def render_xlsx(cls):
9 """
10 Render the df to a xlsx file.
11 """
12
13 # load data
14 df = sample_names_data()
15 # build presentation model
16 pm = tc.build_presentation_model(df=df, output_format="xlsx")
17
18 # render to xlsx
19 tempdir = tempfile.gettempdir()
20 fp = os.path.join(tempdir, "example1.xlsx")
21 layout = [pm]
22 print("Writing to " + fp)
23 xlsxw.OpenPyxlCompositor.to_xlsx(layout=layout, output_fp=fp)
24
25

Example 2 - DataFrame with custom styles¶
In this example, we format the different components of dataframe with various styling attributes
1class XLSXExample2:
2 """
3 Demonstrates using call-backs that help set the display and style
4 properties of each cell in the xlsx sheet.
5 """
6
7 @staticmethod
8 def data_value_func(df):
9 def _inner(idx, col):
10 if col == "gender":
11 if df.loc[idx, col] == "F":
12 return "Female"
13 return "Male"
14 return df.loc[idx, col]
15
16 return _inner
17
18 @staticmethod
19 def data_style_func(df):
20 def _inner(idx, col):
21 bg_color = None
22 number_format = "General"
23 if col == "count":
24 number_format = "#,##0"
25 if df.loc[idx, "gender"] == "F":
26 bg_color = "bbdef8"
27 else:
28 bg_color = "e3f2fd"
29 return xlsstyle.OpenPyxlStyleHelper.get_style(
30 bg_color=bg_color, number_format=number_format
31 )
32
33 return _inner
34
35 @staticmethod
36 def index_name_value_func(value):
37 return value.capitalize()
38
39 @staticmethod
40 def index_name_style_func(value):
41 return xlsstyle.OpenPyxlStyleHelper.default_header_style()
42
43 @staticmethod
44 def header_value_func(node):
45 return node.value.capitalize()
46
47 @staticmethod
48 def header_style_func(node):
49 return xlsstyle.OpenPyxlStyleHelper.default_header_style()
50
51 @staticmethod
52 def index_value_func(node):
53 return node.value.capitalize()
54
55 @staticmethod
56 def index_style_func(df):
57 def _inner(node):
58 bg_color = None
59 if df.loc[node.value, "gender"] == "F":
60 bg_color = "bbdef8"
61 else:
62 bg_color = "e3f2fd"
63 return xlsstyle.OpenPyxlStyleHelper.get_style(bg_color=bg_color)
64
65 return _inner
66
67 @classmethod
68 def render_xlsx(cls):
69 # load data
70 df = sample_names_data()
71 # build presentation model
72 klass_ = XLSXExample2
73 pm = tc.build_presentation_model(
74 df=df,
75 output_format="xlsx",
76 data_value_func=klass_.data_value_func(df),
77 data_style_func=klass_.data_style_func(df),
78 header_value_func=klass_.header_value_func,
79 header_style_func=klass_.header_style_func,
80 index_style_func=klass_.index_style_func(df),
81 index_value_func=klass_.index_value_func,
82 index_name_style_func=klass_.index_name_style_func,
83 index_name_func=klass_.index_name_value_func,
84 )
85
86 # render to xlsx
87 tempdir = tempfile.gettempdir()
88 fp = os.path.join(tempdir, "example2.xlsx")
89 layout = [pm]
90 print("Writing to " + fp)
91 xlsxw.OpenPyxlCompositor.to_xlsx(layout=layout, output_fp=fp)
92
93

Example 3 - Simple DataFrame with Layouts¶
Demonstrates rendering multi-dataframes in one worksheet along with common functions for styling
1class XLSXExample3:
2 """
3 Demonstrates using call-backs and also rendering multiple tables to single
4 worksheet.
5 """
6
7 @staticmethod
8 def data_value_func(df):
9 def _inner(idx, col):
10 if col == "gender":
11 if df.loc[idx, col] == "F":
12 return "Female"
13 return "Male"
14 return df.loc[idx, col]
15
16 return _inner
17
18 @staticmethod
19 def data_style_func(df):
20 def _inner(idx, col):
21 bg_color = None
22 number_format = "General"
23 if col == "count":
24 number_format = "#,##0"
25 if df.loc[idx, "gender"] == "F":
26 bg_color = "bbdef8"
27 else:
28 bg_color = "e3f2fd"
29 return xlsstyle.OpenPyxlStyleHelper.get_style(
30 bg_color=bg_color, number_format=number_format
31 )
32
33 return _inner
34
35 @staticmethod
36 def index_name_value_func(value):
37 return value.capitalize()
38
39 @staticmethod
40 def index_name_style_func(value):
41 return xlsstyle.OpenPyxlStyleHelper.default_header_style()
42
43 @staticmethod
44 def header_value_func(node):
45 return node.value.capitalize()
46
47 @staticmethod
48 def header_style_func(node):
49 return xlsstyle.OpenPyxlStyleHelper.default_header_style()
50
51 @staticmethod
52 def index_value_func(node):
53 return node.value.capitalize()
54
55 @staticmethod
56 def index_style_func(df):
57 def _inner(node):
58 bg_color = None
59 if df.loc[node.value, "gender"] == "F":
60 bg_color = "bbdef8"
61 else:
62 bg_color = "e3f2fd"
63 return xlsstyle.OpenPyxlStyleHelper.get_style(bg_color=bg_color)
64
65 return _inner
66
67 @classmethod
68 def render_xlsx(cls):
69 # Prepare first data frame (same as in render_xlsx)
70 df = sample_names_data()
71 # build presentation model
72 klass_ = XLSXExample3
73 pm_all = tc.build_presentation_model(
74 df=df,
75 output_format="xlsx",
76 data_value_func=klass_.data_value_func(df),
77 data_style_func=klass_.data_style_func(df),
78 header_value_func=klass_.header_value_func,
79 header_style_func=klass_.header_style_func,
80 index_style_func=klass_.index_style_func(df),
81 index_value_func=klass_.index_value_func,
82 index_name_style_func=klass_.index_name_style_func,
83 index_name_func=klass_.index_name_value_func,
84 )
85
86 male_df = top_names_for_year(gender="M")
87 pm_top_male = tc.build_presentation_model(
88 df=male_df,
89 output_format="xlsx",
90 data_value_func=klass_.data_value_func(male_df),
91 data_style_func=klass_.data_style_func(male_df),
92 header_value_func=klass_.header_value_func,
93 header_style_func=klass_.header_style_func,
94 index_style_func=klass_.index_style_func(male_df),
95 index_value_func=klass_.index_value_func,
96 index_name_style_func=klass_.index_name_style_func,
97 index_name_func=klass_.index_name_value_func,
98 )
99
100 female_df = top_names_for_year(gender="F")
101 pm_top_female = tc.build_presentation_model(
102 df=female_df,
103 output_format="xlsx",
104 data_value_func=klass_.data_value_func(female_df),
105 data_style_func=klass_.data_style_func(female_df),
106 header_value_func=klass_.header_value_func,
107 header_style_func=klass_.header_style_func,
108 index_style_func=klass_.index_style_func(female_df),
109 index_value_func=klass_.index_value_func,
110 index_name_style_func=klass_.index_name_style_func,
111 index_name_func=klass_.index_name_value_func,
112 )
113
114 layout = [pm_all, [pm_top_female, pm_top_male]]
115 # render to xlsx
116 tempdir = tempfile.gettempdir()
117 fp = os.path.join(tempdir, "example3.xlsx")
118 print("Writing to " + fp)
119 xlsxw.OpenPyxlCompositor.to_xlsx(
120 layout=layout, output_fp=fp, orientation="horizontal"
121 )
122
123

Example 4 - DataFrames with Multi-hierarchical columns and indices¶
Demonstrates rendering dataframes with multi-hierarchical indices and mult-hierarchical columns
1class XLSXExample4:
2 """
3 Demonstrate styling and rendering of multi-hierarchical indexed dataframe
4 into a xlsx file.
5 """
6
7 @staticmethod
8 def data_style_func(df):
9 def _inner(idx, col):
10 bg_color = None
11 number_format = "General"
12 if col == "count":
13 number_format = "#,##0"
14 if idx[1] == "F":
15 bg_color = "bbdef8"
16 else:
17 bg_color = "e3f2fd"
18 return xlsstyle.OpenPyxlStyleHelper.get_style(
19 bg_color=bg_color, number_format=number_format
20 )
21
22 return _inner
23
24 @staticmethod
25 def index_name_value_func(value):
26 return "Max By Year"
27
28 @staticmethod
29 def index_name_style_func(value):
30 return xlsstyle.OpenPyxlStyleHelper.default_header_style()
31
32 @staticmethod
33 def header_value_func(node):
34 return node.value.capitalize()
35
36 @staticmethod
37 def header_style_func(node):
38 return xlsstyle.OpenPyxlStyleHelper.default_header_style()
39
40 @staticmethod
41 def index_value_func(node):
42 if isinstance(node.value, str):
43 return node.value.capitalize()
44 return node.value
45
46 @staticmethod
47 def index_style_func(df):
48 def _inner(node):
49 bg_color = None
50 if len(node.key) == 1:
51 bg_color = "4f81bd"
52 elif node.key[1] == "F":
53 bg_color = "bbdef8"
54 else:
55 bg_color = "e3f2fd"
56 return xlsstyle.OpenPyxlStyleHelper.get_style(bg_color=bg_color)
57
58 return _inner
59
60 @classmethod
61 def render_xlsx(cls):
62
63 # Prepare first data frame (same as in render_xlsx)
64 data_df = load_names_data()
65 data_df = data_df[data_df["year"] >= 2000]
66 g = data_df.groupby(("year", "gender"))
67 df = g.max()
68
69 klass_ = cls
70 pm = tc.build_presentation_model(
71 df=df,
72 output_format="xlsx",
73 # data_value_func=None, # use default
74 data_style_func=klass_.data_style_func(df),
75 header_value_func=klass_.header_value_func,
76 header_style_func=klass_.header_style_func,
77 index_style_func=klass_.index_style_func(df),
78 index_value_func=klass_.index_value_func,
79 index_name_style_func=klass_.index_name_style_func,
80 index_name_func=klass_.index_name_value_func,
81 )
82
83 layout = [pm]
84 # render to xlsx
85 tempdir = tempfile.gettempdir()
86 fp = os.path.join(tempdir, "example4.xlsx")
87 print("Writing to " + fp)
88 xlsxw.OpenPyxlCompositor.to_xlsx(
89 layout=layout, output_fp=fp, orientation="horizontal"
90 )
91
92

HTML Examples¶
This page provides a list of examples that demonstrate rendering html tables from the given dataframe. Each example is self-contained inside a class. We have some helper functions to provide the data we need for this examples
Helper Functions (Data loading routines)¶
1import tempfile
2import webbrowser
3import zipfile
4
5import pandas as pd
6import requests
7
8import table_compositor.html_styles as html_style
9import table_compositor.html_writer as htmlw
10import table_compositor.table_compositor as tc
11
1# code snippet adapted from http://function-pipe.readthedocs.io/en/latest/usage_df.html
2# source url
3URL_NAMES = "https://www.ssa.gov/oact/babynames/names.zip"
4ZIP_NAME = "names.zip"
5
6
7def load_names_data():
8 fp = os.path.join(tempfile.gettempdir(), ZIP_NAME)
9 if not os.path.exists(fp):
10 r = requests.get(URL_NAMES)
11 with open(fp, "wb") as f:
12 f.write(r.content)
13
14 post = collections.OrderedDict()
15 with zipfile.ZipFile(fp) as zf:
16 # get ZipInfo instances
17 for zi in sorted(zf.infolist(), key=lambda zi: zi.filename):
18 fn = zi.filename
19 if fn.startswith("yob"):
20 year = int(fn[3:7])
21 df = pd.read_csv(
22 zf.open(zi), header=None, names=("name", "gender", "count")
23 )
24 df["year"] = year
25 post[year] = df
26
27 df = pd.concat(post.values())
28 df.set_index("name", inplace=True, drop=True)
29 return df
30
31
32def sample_names_data():
33 df = load_names_data()
34 df = df[(df["year"] == 2015) & (df["count"] > 1000)]
35 return df.sample(50, random_state=0).sort_values("count")
36
37
38def top_names_for_year(year=2015, gender="F", top_n=5):
39 df = load_names_data()
40 df = df[(df["year"] == year) & (df["gender"] == gender)]
41 df = df.sort_values("count")[:top_n]
42 return df
43
44
Example 1 - DataFrame with default styles¶
Demonstrates converting dataframe into html format with default styles.
1class HTMLExample1:
2 """
3 Demonstrate rendering of a simple dataframe into html
4 """
5
6 @classmethod
7 def render_html(cls):
8
9 # load data
10 df = load_names_data()
11 df = df[:100]
12
13 # build presentation model
14 pm = tc.build_presentation_model(df=df, output_format="html")
15
16 # render to xlsx
17 tempdir = tempfile.gettempdir()
18 fp = os.path.join(tempdir, "example_1.html")
19 layout = [pm]
20 print("Writing to " + fp)
21 html = htmlw.HTMLWriter.to_html(layout, border=1)
22 output_fp = os.path.join(tempfile.gettempdir(), "example1.html")
23 with open(output_fp, "w") as f:
24 f.write(html)
25
26

Example 2 - DataFrame with custom styles¶
In this example, we format the different components of dataframe with various styling attributes
1class HTMLExample2:
2 """
3 Demonstrate rendering of a simple dataframe into html
4 """
5
6 @staticmethod
7 def data_value_func(df):
8 def _inner(idx, col):
9 if col == "gender":
10 if df.loc[idx, col] == "F":
11 return "Female"
12 return "Male"
13 return df.loc[idx, col]
14
15 return _inner
16
17 @staticmethod
18 def data_style_func(df):
19 def _inner(idx, col):
20 color = "#FFFFFF"
21 text_align = "left"
22 if col == "count":
23 text_align = "right"
24 if df.loc[idx, "gender"] == "F":
25 color = "#bbdef8"
26 else:
27 color = "#e3f2fd"
28 return html_style.td_style(
29 text_align=text_align,
30 background_color=color,
31 color="#000000",
32 font_weight="normal",
33 white_space="pre",
34 padding="10px",
35 border=None,
36 )
37
38 return _inner
39
40 @staticmethod
41 def index_name_value_func(value):
42 return value.capitalize()
43
44 @staticmethod
45 def header_value_func(node):
46 return node.value.capitalize()
47
48 @staticmethod
49 def header_style_func(node):
50 return html_style.td_style(
51 text_align="center",
52 background_color="#4F81BD",
53 color="#FFFFFF",
54 font_weight="bold",
55 white_space="pre",
56 padding="10px",
57 border=1,
58 )
59
60 @staticmethod
61 def index_value_func(node):
62 return node.value.capitalize()
63
64 @staticmethod
65 def index_style_func(node):
66 return html_style.td_style(
67 text_align="center",
68 background_color="#4F81BD",
69 color="#FFFFFF",
70 font_weight="bold",
71 white_space="pre",
72 padding="10px",
73 border=1,
74 )
75
76 @classmethod
77 def render_html(cls):
78 # load data
79 df = sample_names_data()
80 # build presentation model
81 klass_ = HTMLExample2
82 pm = tc.build_presentation_model(
83 df=df,
84 output_format="html",
85 data_value_func=klass_.data_value_func(df),
86 data_style_func=klass_.data_style_func(df),
87 header_value_func=klass_.header_value_func,
88 header_style_func=klass_.header_style_func,
89 index_style_func=klass_.index_style_func,
90 index_value_func=klass_.index_value_func,
91 index_name_func=klass_.index_name_value_func,
92 )
93
94 layout = [pm]
95 html = htmlw.HTMLWriter.to_html(layout, border=1)
96 output_fp = os.path.join(tempfile.gettempdir(), "example2.html")
97 print("Writing to =", output_fp)
98 with open(output_fp, "w") as f:
99 f.write(html)
100
101

Example 3 - Simple DataFrame with Layouts¶
Demonstrates rendering dataframes with multi-hierarchical indices and mult-hierarchical columns
1class HTMLExample3:
2 """
3 Demonstrate styling and rendering of multiple multi-hierarchical indexed dataframe
4 into a html file
5 """
6
7 @staticmethod
8 def data_value_func(df):
9 def _inner(idx, col):
10 if col == "gender":
11 if df.loc[idx, col] == "F":
12 return "Female"
13 return "Male"
14 return df.loc[idx, col]
15
16 return _inner
17
18 @staticmethod
19 def data_style_func(df):
20 def _inner(idx, col):
21 color = "#FFFFFF"
22 text_align = "left"
23 if col == "count":
24 text_align = "right"
25 if df.loc[idx, "gender"] == "F":
26 color = "#bbdef8"
27 else:
28 color = "#e3f2fd"
29 return html_style.td_style(
30 text_align=text_align,
31 background_color=color,
32 color="#000000",
33 font_weight="normal",
34 white_space="pre",
35 padding="10px",
36 border=None,
37 )
38
39 return _inner
40
41 @staticmethod
42 def index_name_value_func(value):
43 return "Max By Year"
44
45 @staticmethod
46 def header_value_func(node):
47 return node.value.capitalize()
48
49 @staticmethod
50 def header_style_func(node):
51 return html_style.td_style(
52 text_align="center",
53 background_color="#4F81BD",
54 color="#FFFFFF",
55 font_weight="bold",
56 white_space="pre",
57 padding="10px",
58 border=1,
59 )
60
61 @staticmethod
62 def index_value_func(node):
63 if isinstance(node.value, str):
64 return node.value.capitalize()
65 return node.value
66
67 @staticmethod
68 def index_style_func(node):
69 return html_style.td_style(
70 text_align="center",
71 background_color="#4F81BD",
72 color="#FFFFFF",
73 font_weight="bold",
74 white_space="pre",
75 padding="10px",
76 border=1,
77 )
78
79 @classmethod
80 def render_html(cls):
81
82 # Prepare first data frame (same as in render_xlsx)
83 df = sample_names_data()
84 # build presentation model
85 klass_ = HTMLExample4
86 pm_all = tc.build_presentation_model(
87 df=df,
88 output_format="html",
89 data_value_func=klass_.data_value_func(df),
90 data_style_func=klass_.data_style_func(df),
91 header_value_func=klass_.header_value_func,
92 header_style_func=klass_.header_style_func,
93 index_style_func=klass_.index_style_func,
94 index_value_func=klass_.index_value_func,
95 index_name_func=lambda _: "Sample Data",
96 )
97
98 male_df = top_names_for_year(gender="M")
99 pm_top_male = tc.build_presentation_model(
100 df=male_df,
101 output_format="html",
102 data_value_func=klass_.data_value_func(male_df),
103 data_style_func=klass_.data_style_func(male_df),
104 header_value_func=klass_.header_value_func,
105 header_style_func=klass_.header_style_func,
106 index_style_func=klass_.index_style_func,
107 index_value_func=klass_.index_value_func,
108 index_name_func=lambda _: "Max by Year",
109 )
110
111 female_df = top_names_for_year(gender="F")
112 pm_top_female = tc.build_presentation_model(
113 df=female_df,
114 output_format="html",
115 data_value_func=klass_.data_value_func(female_df),
116 data_style_func=klass_.data_style_func(female_df),
117 header_value_func=klass_.header_value_func,
118 header_style_func=klass_.header_style_func,
119 index_style_func=klass_.index_style_func,
120 index_value_func=klass_.index_value_func,
121 index_name_func=lambda _: "Max by Year",
122 )
123
124 layout = [pm_all, [pm_top_female, pm_top_male]]
125 # render to xlsx
126 html = htmlw.HTMLWriter.to_html(layout, border=1, orientation="horizontal")
127 output_fp = os.path.join(tempfile.gettempdir(), "example3.html")
128 print("Writing to =", output_fp)
129 with open(output_fp, "w") as f:
130 f.write(html)
131
132

Example 4 - DataFrames with Multi-hierarchical columns and indices¶
Demonstrates rendering dataframes with multi-hierarchical indices and mult-hierarchical columns
1class HTMLExample4:
2 """
3 Demonstrate styling and rendering of multi-hierarchical indexed dataframe
4 into a html file.
5 """
6
7 @staticmethod
8 def data_value_func(df):
9 def _inner(idx, col):
10 if col == "gender":
11 if df.loc[idx, col] == "F":
12 return "Female"
13 return "Male"
14 return df.loc[idx, col]
15
16 return _inner
17
18 @staticmethod
19 def data_style_func(df):
20 def _inner(idx, col):
21 color = "#FFFFFF"
22 text_align = "left"
23 if col == "count":
24 text_align = "right"
25 if idx[1] == "F":
26 color = "#bbdef8"
27 else:
28 color = "#e3f2fd"
29
30 return html_style.td_style(
31 text_align=text_align,
32 background_color=color,
33 color="#000000",
34 font_weight="normal",
35 white_space="pre",
36 padding="10px",
37 border=None,
38 )
39
40 return _inner
41
42 @staticmethod
43 def index_name_value_func(value):
44 return "Max By Year"
45
46 @staticmethod
47 def header_value_func(node):
48 return node.value.capitalize()
49
50 @staticmethod
51 def header_style_func(node):
52 return html_style.td_style(
53 text_align="center",
54 background_color="#4F81BD",
55 color="#FFFFFF",
56 font_weight="bold",
57 white_space="pre",
58 padding="10px",
59 border=1,
60 )
61
62 @staticmethod
63 def index_value_func(node):
64 if isinstance(node.value, str):
65 return node.value.capitalize()
66 return node.value
67
68 @staticmethod
69 def index_style_func(node):
70 return html_style.td_style(
71 text_align="center",
72 background_color="#4F81BD",
73 color="#FFFFFF",
74 font_weight="bold",
75 white_space="pre",
76 padding="10px",
77 border=1,
78 )
79
80 @classmethod
81 def render_html(cls):
82
83 # Prepare first data frame (same as in render_xlsx)
84 data_df = load_names_data()
85 data_df = data_df[data_df["year"] >= 2000]
86 g = data_df.groupby(("year", "gender"))
87 df = g.max()
88
89 klass_ = cls
90 pm = tc.build_presentation_model(
91 df=df,
92 output_format="html",
93 data_value_func=klass_.data_value_func(df),
94 data_style_func=klass_.data_style_func(df),
95 header_value_func=klass_.header_value_func,
96 header_style_func=klass_.header_style_func,
97 index_style_func=klass_.index_style_func,
98 index_value_func=klass_.index_value_func,
99 index_name_func=klass_.index_name_value_func,
100 )
101
102 layout = [pm]
103 # render to xlsx
104 html = htmlw.HTMLWriter.to_html(layout, border=1)
105 output_fp = os.path.join(tempfile.gettempdir(), "example4.html")
106 print("Writing to =", output_fp)
107 with open(output_fp, "w") as f:
108 f.write(html)
109
110

API¶
Building the presentation model¶
- table_compositor.table_compositor.build_presentation_model(*, df, output_format='xlsx', data_value_func=None, column_style_func=None, data_style_func=None, header_style_func=None, header_value_func=None, index_style_func=None, index_value_func=None, index_name_func=None, index_name_style_func=None, engine='openpyxl', **kwargs)[source]¶
Construct and return the presentation model that will be used while rendering to html/xlsx formats. The returned object has all the information required to render the tables in the requested format. The details of the object is transparent to the caller. It is only exposed for certain advanced operations.
- Parameters
df – The dataframe representation of the table. The shape of the dataframe closely resembles the table that will be rendered in the requested format.
output_format – ‘html’ or ‘xlsx’
data_value_func – example: lambda idx, col: df.loc[idx, col], assuming df is in the closure. This can be None, if no data transformation is required to the values already present in the source df
column_style_func – the function can substitute the data_style_func, if the same style can be applied for the whole column. This argument should be prefered over the data_style_func argument. Using this option provides better performance since the fewer objects will be created internally and fewer callbacks are made to this function when compared to data_style_func.This argument only applies to the data contained in the dataframe and not the cell where the headers are rendered. For fine grained control at cell level, the data_style_func argument can be used. For more information on return values of this function, refer to the documentation for data_style_func argument.
data_style_func – used to provide style at the cell level. Example: lambda idx, col: return dict(font=Font(…)), where Font is the openpyxl object and font is the attr available in the cell instance of openpyxl. For xlsx, the keys in the dict are the attrs of the cell object in openpyxl and the values correspond to the value of that attribute. Example are found in xlsx_styles module. For html, the key-value pairs are any values that go into to the style attribute of a td, th cell in html. Examples are found in html_styles module. example: dict(background-color=’#F8F8F8’). When performance becomes an issue, and cell level control is not needed, it is recommended to use the column_style_func argument rathat than this argument. If the prefered engine is XlswWriter, then the style dictionary returned should have key/values compatible with the Format object declarted in the XlsxWriter library. A reference can be found in ``xlsx_styles.XlsxWriterStyleHelper` class
header_value_func – func that takes a object of type IndexNode. The IndexNode contains the attributes that refer to the header being rendered. The returned value from this function is displayed in place of the header in the dataframe at the location. The two properties available on the IndexNode object are value and key. The key is useful to identify the exact index and level in context while working with multi-hierarchical columns.
header_style_func – func that takes a object of type IndexNode. The return value of this function is similar to data_style_func.
index_value_func – func that takes a object of type IndexNode. The IndexNode contains the attributes that refer to the index being rendered. The returned value from this function is displayed in place of the index in the dataframe at the location.
index_style_func – func that takes a object of type IndexNode. The return value of this function is similar to data_style_func.
index_name_func – func that returns a string for index name (value to be displayed on top-left corner, above the index column)
index_name_style – the style value same as data_style_func that will be used to style the cell
engine – required while building presentation model for xlsx. Argument ignored for HTML rendering. This argument is used to provide the default callback style functions, where the style dictionary returned by the callback functions should be compatible with the engine being used.
kwargs –
‘hide_index’ - if True, then hide the index column, default=False
’hide_header, - if True, then hide the header, default=False
’use_convert’ - if True, do some conversions from dataframe values to values excel can understand for example np.NaN are converted to NaN strings
- Returns
A presentation model, to be used to create layout and provide the layout to the html or xlsx writers.
About the callback functions provided as arguments:
Note that callback function provided as arguments to this function are provided with either a tuple of index, col arguments are some information regarding the index or headers being rendered. Therefore, a common pattern would be to capture the dataframe being rendered in a closure of this callback func before passing them as arugments.
For example:
df = pd.DataFrame(dict(a=[1, 2, 3]))
- def data_value_func():
- def _inner(idx, col):
return df.loc[idx, col] * 10.3
return _inner
pm = build_presentation_model(df=df, data_value_func=data_value_func())
Rendering to XLSX¶
Rendering to HTML¶
- class table_compositor.html_writer.HTMLWriter[source]¶
- static to_html(layout, orientation='vertical', **kwargs)[source]¶
Take a layout which contains a list of presentation models builts using the build_presentation_model function.
- Parameters
layout – An nested list of presentation_models, examples: [presentation_model] or [presentation_model1, presentation_mode2]. Not all nested layouts work very well in HTML, currently
orientation – if vertical, the top level presentation model elements are rendered vertically, and for every nested level the orientation is flipped. if horizontal, then the behavior is inverse
kwargs – all key-value pairs available in kwargs are directly set as value of the style attribute of table tag. example dict(backgroud-color=’#FF88FF’), is used as <table style=’background-color:#FF88FF’>..</table>
- Returns
Return a HTML formatted string. The outermost tag of the returned string is the <table>
Helper XLSX Styles¶
- class table_compositor.xlsx_styles.OpenPyxlStyleHelper[source]¶
- static default_header_style(*, alignment='center', font=<openpyxl.styles.fonts.Font object> Parameters: name=None, charset=None, family=None, b=True, i=False, strike=None, outline=None, shadow=None, condense=None, color=<openpyxl.styles.colors.Color object> Parameters: rgb='00FFFFFF', indexed=None, auto=None, theme=None, tint=0.0, type='rgb', extend=None, sz=None, u=None, vertAlign=None, scheme=None, bgColor='4F81BD', border=<object object>)[source]¶
Provides styles for default headers for OpenPyxl engine
- Parameters
alignment – ‘center’, ‘left’, ‘right’ used for horizontal alignment
font – an openpyxl.Font instance
bgColor – hex color that will be used as background color in the fill pattern
border – an openpyxl.Border instance, defaults to thin white border
- Returns
A dict of key-values pairs, where each key is a attr of the cell object in openyxl and value is valid value of that attr.
- static get_style(number_format='General', bg_color=None, border=None, font=None)[source]¶
Helper method to return a openpyxl Style
- Parameters
number_format – an xlsx compatibale number format string
bg_color – hex color that will be used as background color in the fill pattern
border – an openpyxl.Border instance, defaults to thin white border
- Returns
A dict of key-values pairs, where each key is a attr of the cell object in openyxl and value is valid value of that attr.
- class table_compositor.xlsx_styles.XlsxWriterStyleHelper[source]¶
Class provides style objects for XlsxWriter library uses to render xlsx files
- static default_header_style(*, number_format='General', alignment='center', font=None, bgColor='#4F81BD', border=<object object>)[source]¶
Provides styles for default headers for XlsxWriter engine
- Parameters
alignment – ‘center’, ‘left’, ‘right’ used for horizontal alignment
font – an openpyxl.Font instance
bgColor – hex color that will be used as background color in the fill pattern
border – an openpyxl.Border instance, defaults to thin white border
- Returns
A dict of key-values pairs, where each key is a attr of the cell object in openyxl and value is valid value of that attr.
- static get_style(number_format='General', bg_color=None, border=<object object>)[source]¶
Helper method to return Style dictionary for XlsxWriter engine
- Parameters
number_format – an xlsx compatibale number format string
bg_color – hex color that will be used as background color in the fill pattern
border – an openpyxl.Border instance, defaults to thin white border
- Returns
A dict of key-values pairs, where each key/value is compatible with the Format object in XlsxWriter library.
- class table_compositor.xlsx_styles.XLSXWriterDefaults[source]¶
Class provides defaults callback funcs that can be used while calling the build_presentation_model.
- static data_style_func(df)[source]¶
Default value that can be used as callback for data_style_func
- Parameters
df – the dataframe that will be used to build the presentation model
- Returns
a function table takes idx, col as arguments and returns a openpyxl compatible style dictionary
- static data_value_func(df)[source]¶
Default value that can be used as callback for data_value_func
- Parameters
df – the dataframe that will be used to build the presentation model
- Returns
A function that takes idx, col as arguments and returns the df.loc[idx, col] value
- static header_style_func(df)[source]¶
Default value that can be used as callback for data_style_func
- Parameters
df – the dataframe that will be used to build the presentation model
- Returns
a function table takes node as arguments and returns a openpyxl compatible style dictionary
- static header_value_func(df)[source]¶
Default value that can be used as callback for data_header_func
- Parameters
df – the dataframe that will be used to build the presentation model
- Returns
a function table takes node as arguments and returns node.value
- static index_name_style_func(df)[source]¶
Default value that can be used as callback for index_name_style_func
- Parameters
df – the dataframe that will be used to build the presentation model
- Returns
a function table takes index.name as arguments and returns a openpyxl compatible style dictionary
- static index_name_value_func(df)[source]¶
Default value that can be used as callback for index_name_value_func
- Parameters
df – the dataframe that will be used to build the presentation model
- Returns
a function table takes index.name as arguments and returns index.name if not None, else ‘’
Helper HTML Styles¶
- class table_compositor.html_styles.HTMLWriterDefaults[source]¶
Class provides defaults callback funcs that can be used while calling the build_presentation_model.
- static data_style_func(df)[source]¶
Default value that can be used as callback for data_style_func
- Parameters
df – the dataframe that will be used to build the presentation model
- Returns
a function table takes idx, col as arguments and returns a dictionary of html style attributes
- static data_value_func(df, dollar_columns=None)[source]¶
Default value that can be used as callback for data_value_func
- Parameters
df – the dataframe that will be used to build the presentation model
- Returns
A function that takes idx, col as arguments and returns the df.loc[idx, col] value
- static header_style_func(df)[source]¶
Default value that can be used as callback for header_style_func
- Parameters
df – the dataframe that will be used to build the presentation model
- Returns
a function table takes node as argument and returns a dictionary of html style attributes
- static header_value_func(df)[source]¶
Default value that can be used as callback for header_value_func
- Parameters
df – the dataframe that will be used to build the presentation model
- Returns
A function that takes node as arguments and returns the node.value
- static index_name_style_func(df)[source]¶
Default value that can be used as callback for index_name_style_func
- Parameters
df – the dataframe that will be used to build the presentation model
- Returns
a function table takes index.name as argument and returns a dictionary of html style attributes
- static index_name_value_func(df)[source]¶
Default value that can be used as callback for index_name_value_func
- Parameters
df – the dataframe that will be used to build the presentation model
- Returns
A function that takes index.name as argument and return index.name if not None else ‘’
Code Used in documentation¶
Basic Usage¶
# start_imports
import os
import tempfile
import pandas as pd
from table_compositor.table_compositor import build_presentation_model
from table_compositor.xlsx_styles import OpenPyxlStyleHelper
# There are equivalent classes for using xlsxwriter library. Namely,
# XlsxWriterCompositor and XlsxWriterStyleHelper
from table_compositor.xlsx_writer import OpenPyxlCompositor
# end_imports
# start_basic_example_2
def basic_example2():
df = pd.DataFrame(
dict(a=[10, 20, 30, 40, 50], b=[0.1, 0.9, 0.2, 0.6, 0.3]), index=[1, 2, 3, 4, 5]
)
def style_func(idx, col):
if col == "b":
return OpenPyxlStyleHelper.get_style(number_format="0.00%")
else:
# for 'a' we do dollar format
return OpenPyxlStyleHelper.get_style(number_format="$#,##.00")
# create a presentation model
# note the OpenPyxlStyleHelper function available in xlsx_styles module. But a return value of style function
# can be any dict whose keys are attributes of the OpenPyxl cell object.
presentation_model = build_presentation_model(
df=df,
data_value_func=lambda idx, col: df.loc[idx, col] * 10
if col == "a"
else df.loc[idx, col],
data_style_func=style_func,
header_value_func=lambda node: node.value.capitalize(),
header_style_func=lambda _: OpenPyxlStyleHelper.default_header_style(),
index_value_func=lambda node: node.value * 100,
index_style_func=lambda _: OpenPyxlStyleHelper.default_header_style(),
index_name_func=lambda _: "Basic Example",
index_name_style_func=lambda _: OpenPyxlStyleHelper.default_header_style(),
)
# create a layout, which is usually a nested list of presentation models
layout = [presentation_model]
# render to xlsx
output_fp = os.path.join(tempfile.gettempdir(), "basic_example2.xlsx")
OpenPyxlCompositor.to_xlsx(layout=layout, output_fp=output_fp)
# end_basic_example_2
# start_basic_example_3
def basic_example3():
df = pd.DataFrame(
dict(
a=[10, 20, 30, 40],
b=[0.1, 0.9, 0.2, 0.6],
d=[50, 60, 70, 80],
e=[200, 300, 400, 500],
)
)
df.columns = pd.MultiIndex.from_tuples(
[("A", "x"), ("A", "y"), ("B", "x"), ("B", "y")]
)
df.index = pd.MultiIndex.from_tuples([(1, 100), (1, 200), (2, 100), (2, 200)])
print(df)
def index_style_func(node):
# node.key here could be one of (1,), (1, 100), (2,), (2, 100), (2, 200)
bg_color = "FFFFFF"
if node.key == (1,) or node.key == (2,):
bg_color = "9E80B8"
elif node.key[1] == 100:
bg_color = "4F90C1"
elif node.key[1] == 200:
bg_color = "6DC066"
return OpenPyxlStyleHelper.get_style(bg_color=bg_color)
def header_style_func(node):
bg_color = "FFFFFF"
if node.key == ("A",) or node.key == ("B",):
bg_color = "9E80B8"
elif node.key[1] == "x":
bg_color = "4F90C1"
elif node.key[1] == "y":
bg_color = "6DC066"
return OpenPyxlStyleHelper.get_style(bg_color=bg_color)
# create a presentation model
# note the OpenPyxlStyleHeloer function available in xlsx_styles module. But a return value of style function
# can be any dict whose keys are attributes of the OpenPyxl cell object.
presentation_model = build_presentation_model(
df=df,
index_style_func=index_style_func,
header_style_func=header_style_func,
index_name_func=lambda _: "Multi-Hierarchy Example",
)
# create a layout, which is usually a nested list of presentation models
layout = [presentation_model]
# render to xlsx
output_fp = os.path.join(tempfile.gettempdir(), "basic_example3.xlsx")
OpenPyxlCompositor.to_xlsx(layout=layout, output_fp=output_fp)
# end_basic_example_3
# start_layout_example_1
def layout_example1():
df = pd.DataFrame(
dict(a=[10, 20, 30, 40, 50], b=[0.1, 0.9, 0.2, 0.6, 0.3]), index=[1, 2, 3, 4, 5]
)
def style_func(idx, col):
if col == "b":
return OpenPyxlStyleHelper.get_style(number_format="0.00%")
else:
# for 'a' we do dollar format
return OpenPyxlStyleHelper.get_style(number_format="$#,##.00")
# create a presentation model
# note the OpenPyxlStyleHeloer function available in xlsx_styles module. But a return value of style function
# can be any dict whose keys are attributes of the OpenPyxl cell object.
presentation_model = build_presentation_model(
df=df,
data_value_func=lambda idx, col: df.loc[idx, col] * 10
if col == "a"
else df.loc[idx, col],
data_style_func=style_func,
header_value_func=lambda node: node.value.capitalize(),
header_style_func=lambda _: OpenPyxlStyleHelper.default_header_style(),
index_value_func=lambda node: node.value * 100,
index_style_func=lambda _: OpenPyxlStyleHelper.default_header_style(),
index_name_func=lambda _: "Basic Example",
index_name_style_func=lambda _: OpenPyxlStyleHelper.default_header_style(),
)
# start_layout_code_1
# create a layout, which is usually a nested list of presentation models
layout = [[presentation_model], [[presentation_model], [presentation_model]]]
# render to xlsx
output_fp = os.path.join(tempfile.gettempdir(), "layout_vertical_example1.xlsx")
# the default value for orientation is 'vertical'
OpenPyxlCompositor.to_xlsx(
layout=layout, output_fp=output_fp, orientation="vertical"
)
output_fp = os.path.join(tempfile.gettempdir(), "layout_horizontal_example1.xlsx")
OpenPyxlCompositor.to_xlsx(
layout=layout, output_fp=output_fp, orientation="horizontal"
)
print("Writing xlsx file=", output_fp)
# mutiple nesting
layout_complex = [
presentation_model,
[presentation_model, [presentation_model, presentation_model]],
]
output_fp = os.path.join(tempfile.gettempdir(), "layout_complex_example1.xlsx")
OpenPyxlCompositor.to_xlsx(
layout=layout_complex, output_fp=output_fp, orientation="vertical"
)
print("Writing xlsx file=", output_fp)
# end_layout_code_1
# end_layout_example_1
if __name__ == "__main__":
basic_example2()
basic_example3()
layout_example1()
XLSX Examples¶
"""
This module is referred to by the Sphinx documentation. If you need to run this
module, install table_compositor in an separate environment and then run this module
in that environment. This helps the imports find the modules in the right place
"""
import collections
import os
# start_imports
import tempfile
import webbrowser
import zipfile
import pandas as pd
import requests
import table_compositor.table_compositor as tc
import table_compositor.xlsx_styles as xlsstyle
import table_compositor.xlsx_writer as xlsxw
# end_imports
# start_data_routine
# code snippet adapted from http://function-pipe.readthedocs.io/en/latest/usage_df.html
# source url
URL_NAMES = "https://www.ssa.gov/oact/babynames/names.zip"
ZIP_NAME = "names.zip"
def load_names_data():
fp = os.path.join(tempfile.gettempdir(), ZIP_NAME)
if not os.path.exists(fp):
r = requests.get(URL_NAMES)
with open(fp, "wb") as f:
f.write(r.content)
post = collections.OrderedDict()
with zipfile.ZipFile(fp) as zf:
# get ZipInfo instances
for zi in sorted(zf.infolist(), key=lambda zi: zi.filename):
fn = zi.filename
if fn.startswith("yob"):
year = int(fn[3:7])
df = pd.read_csv(
zf.open(zi), header=None, names=("name", "gender", "count")
)
df["year"] = year
post[year] = df
df = pd.concat(post.values())
df.set_index("name", inplace=True, drop=True)
return df
def sample_names_data():
df = load_names_data()
df = df[(df["year"] == 2015) & (df["count"] > 1000)]
return df.sample(100, random_state=0).sort_values("count")
def top_names_for_year(year=2015, gender="F", top_n=5):
df = load_names_data()
df = df[(df["year"] == year) & (df["gender"] == gender)]
df = df.sort_values("count")[:top_n]
return df
# end_data_routine
# start_XLSXExample1
class XLSXExample1:
"""
Demonstrates rendering a simple dataframe to a xlsx file
using the default styles
"""
@classmethod
def render_xlsx(cls):
"""
Render the df to a xlsx file.
"""
# load data
df = sample_names_data()
# build presentation model
pm = tc.build_presentation_model(df=df, output_format="xlsx")
# render to xlsx
tempdir = tempfile.gettempdir()
fp = os.path.join(tempdir, "example1.xlsx")
layout = [pm]
print("Writing to " + fp)
xlsxw.OpenPyxlCompositor.to_xlsx(layout=layout, output_fp=fp)
# end_XLSXExample1
# start_XLSXExample2
class XLSXExample2:
"""
Demonstrates using call-backs that help set the display and style
properties of each cell in the xlsx sheet.
"""
@staticmethod
def data_value_func(df):
def _inner(idx, col):
if col == "gender":
if df.loc[idx, col] == "F":
return "Female"
return "Male"
return df.loc[idx, col]
return _inner
@staticmethod
def data_style_func(df):
def _inner(idx, col):
bg_color = None
number_format = "General"
if col == "count":
number_format = "#,##0"
if df.loc[idx, "gender"] == "F":
bg_color = "bbdef8"
else:
bg_color = "e3f2fd"
return xlsstyle.OpenPyxlStyleHelper.get_style(
bg_color=bg_color, number_format=number_format
)
return _inner
@staticmethod
def index_name_value_func(value):
return value.capitalize()
@staticmethod
def index_name_style_func(value):
return xlsstyle.OpenPyxlStyleHelper.default_header_style()
@staticmethod
def header_value_func(node):
return node.value.capitalize()
@staticmethod
def header_style_func(node):
return xlsstyle.OpenPyxlStyleHelper.default_header_style()
@staticmethod
def index_value_func(node):
return node.value.capitalize()
@staticmethod
def index_style_func(df):
def _inner(node):
bg_color = None
if df.loc[node.value, "gender"] == "F":
bg_color = "bbdef8"
else:
bg_color = "e3f2fd"
return xlsstyle.OpenPyxlStyleHelper.get_style(bg_color=bg_color)
return _inner
@classmethod
def render_xlsx(cls):
# load data
df = sample_names_data()
# build presentation model
klass_ = XLSXExample2
pm = tc.build_presentation_model(
df=df,
output_format="xlsx",
data_value_func=klass_.data_value_func(df),
data_style_func=klass_.data_style_func(df),
header_value_func=klass_.header_value_func,
header_style_func=klass_.header_style_func,
index_style_func=klass_.index_style_func(df),
index_value_func=klass_.index_value_func,
index_name_style_func=klass_.index_name_style_func,
index_name_func=klass_.index_name_value_func,
)
# render to xlsx
tempdir = tempfile.gettempdir()
fp = os.path.join(tempdir, "example2.xlsx")
layout = [pm]
print("Writing to " + fp)
xlsxw.OpenPyxlCompositor.to_xlsx(layout=layout, output_fp=fp)
# end_XLSXExample2
# start_XLSXExample3
class XLSXExample3:
"""
Demonstrates using call-backs and also rendering multiple tables to single
worksheet.
"""
@staticmethod
def data_value_func(df):
def _inner(idx, col):
if col == "gender":
if df.loc[idx, col] == "F":
return "Female"
return "Male"
return df.loc[idx, col]
return _inner
@staticmethod
def data_style_func(df):
def _inner(idx, col):
bg_color = None
number_format = "General"
if col == "count":
number_format = "#,##0"
if df.loc[idx, "gender"] == "F":
bg_color = "bbdef8"
else:
bg_color = "e3f2fd"
return xlsstyle.OpenPyxlStyleHelper.get_style(
bg_color=bg_color, number_format=number_format
)
return _inner
@staticmethod
def index_name_value_func(value):
return value.capitalize()
@staticmethod
def index_name_style_func(value):
return xlsstyle.OpenPyxlStyleHelper.default_header_style()
@staticmethod
def header_value_func(node):
return node.value.capitalize()
@staticmethod
def header_style_func(node):
return xlsstyle.OpenPyxlStyleHelper.default_header_style()
@staticmethod
def index_value_func(node):
return node.value.capitalize()
@staticmethod
def index_style_func(df):
def _inner(node):
bg_color = None
if df.loc[node.value, "gender"] == "F":
bg_color = "bbdef8"
else:
bg_color = "e3f2fd"
return xlsstyle.OpenPyxlStyleHelper.get_style(bg_color=bg_color)
return _inner
@classmethod
def render_xlsx(cls):
# Prepare first data frame (same as in render_xlsx)
df = sample_names_data()
# build presentation model
klass_ = XLSXExample3
pm_all = tc.build_presentation_model(
df=df,
output_format="xlsx",
data_value_func=klass_.data_value_func(df),
data_style_func=klass_.data_style_func(df),
header_value_func=klass_.header_value_func,
header_style_func=klass_.header_style_func,
index_style_func=klass_.index_style_func(df),
index_value_func=klass_.index_value_func,
index_name_style_func=klass_.index_name_style_func,
index_name_func=klass_.index_name_value_func,
)
male_df = top_names_for_year(gender="M")
pm_top_male = tc.build_presentation_model(
df=male_df,
output_format="xlsx",
data_value_func=klass_.data_value_func(male_df),
data_style_func=klass_.data_style_func(male_df),
header_value_func=klass_.header_value_func,
header_style_func=klass_.header_style_func,
index_style_func=klass_.index_style_func(male_df),
index_value_func=klass_.index_value_func,
index_name_style_func=klass_.index_name_style_func,
index_name_func=klass_.index_name_value_func,
)
female_df = top_names_for_year(gender="F")
pm_top_female = tc.build_presentation_model(
df=female_df,
output_format="xlsx",
data_value_func=klass_.data_value_func(female_df),
data_style_func=klass_.data_style_func(female_df),
header_value_func=klass_.header_value_func,
header_style_func=klass_.header_style_func,
index_style_func=klass_.index_style_func(female_df),
index_value_func=klass_.index_value_func,
index_name_style_func=klass_.index_name_style_func,
index_name_func=klass_.index_name_value_func,
)
layout = [pm_all, [pm_top_female, pm_top_male]]
# render to xlsx
tempdir = tempfile.gettempdir()
fp = os.path.join(tempdir, "example3.xlsx")
print("Writing to " + fp)
xlsxw.OpenPyxlCompositor.to_xlsx(
layout=layout, output_fp=fp, orientation="horizontal"
)
# end_XLSXExample3
# start_XLSXExample4
class XLSXExample4:
"""
Demonstrate styling and rendering of multi-hierarchical indexed dataframe
into a xlsx file.
"""
@staticmethod
def data_style_func(df):
def _inner(idx, col):
bg_color = None
number_format = "General"
if col == "count":
number_format = "#,##0"
if idx[1] == "F":
bg_color = "bbdef8"
else:
bg_color = "e3f2fd"
return xlsstyle.OpenPyxlStyleHelper.get_style(
bg_color=bg_color, number_format=number_format
)
return _inner
@staticmethod
def index_name_value_func(value):
return "Max By Year"
@staticmethod
def index_name_style_func(value):
return xlsstyle.OpenPyxlStyleHelper.default_header_style()
@staticmethod
def header_value_func(node):
return node.value.capitalize()
@staticmethod
def header_style_func(node):
return xlsstyle.OpenPyxlStyleHelper.default_header_style()
@staticmethod
def index_value_func(node):
if isinstance(node.value, str):
return node.value.capitalize()
return node.value
@staticmethod
def index_style_func(df):
def _inner(node):
bg_color = None
if len(node.key) == 1:
bg_color = "4f81bd"
elif node.key[1] == "F":
bg_color = "bbdef8"
else:
bg_color = "e3f2fd"
return xlsstyle.OpenPyxlStyleHelper.get_style(bg_color=bg_color)
return _inner
@classmethod
def render_xlsx(cls):
# Prepare first data frame (same as in render_xlsx)
data_df = load_names_data()
data_df = data_df[data_df["year"] >= 2000]
g = data_df.groupby(("year", "gender"))
df = g.max()
klass_ = cls
pm = tc.build_presentation_model(
df=df,
output_format="xlsx",
# data_value_func=None, # use default
data_style_func=klass_.data_style_func(df),
header_value_func=klass_.header_value_func,
header_style_func=klass_.header_style_func,
index_style_func=klass_.index_style_func(df),
index_value_func=klass_.index_value_func,
index_name_style_func=klass_.index_name_style_func,
index_name_func=klass_.index_name_value_func,
)
layout = [pm]
# render to xlsx
tempdir = tempfile.gettempdir()
fp = os.path.join(tempdir, "example4.xlsx")
print("Writing to " + fp)
xlsxw.OpenPyxlCompositor.to_xlsx(
layout=layout, output_fp=fp, orientation="horizontal"
)
# end_XLSXExample4
def main():
XLSXExample1.render_xlsx()
XLSXExample2.render_xlsx()
XLSXExample3.render_xlsx()
XLSXExample4.render_xlsx()
if __name__ == "__main__":
main()
HTML Examples¶
"""
This module is referred to by the Sphinx documentation. If you need to run this
module, install table_compositor in an separate environment and then run this module
in that environment. This helps the imports find the modules in the right place
"""
import collections
import os
# start_imports
import tempfile
import webbrowser
import zipfile
import pandas as pd
import requests
import table_compositor.html_styles as html_style
import table_compositor.html_writer as htmlw
import table_compositor.table_compositor as tc
# end_imports
# start_data_routine
# code snippet adapted from http://function-pipe.readthedocs.io/en/latest/usage_df.html
# source url
URL_NAMES = "https://www.ssa.gov/oact/babynames/names.zip"
ZIP_NAME = "names.zip"
def load_names_data():
fp = os.path.join(tempfile.gettempdir(), ZIP_NAME)
if not os.path.exists(fp):
r = requests.get(URL_NAMES)
with open(fp, "wb") as f:
f.write(r.content)
post = collections.OrderedDict()
with zipfile.ZipFile(fp) as zf:
# get ZipInfo instances
for zi in sorted(zf.infolist(), key=lambda zi: zi.filename):
fn = zi.filename
if fn.startswith("yob"):
year = int(fn[3:7])
df = pd.read_csv(
zf.open(zi), header=None, names=("name", "gender", "count")
)
df["year"] = year
post[year] = df
df = pd.concat(post.values())
df.set_index("name", inplace=True, drop=True)
return df
def sample_names_data():
df = load_names_data()
df = df[(df["year"] == 2015) & (df["count"] > 1000)]
return df.sample(50, random_state=0).sort_values("count")
def top_names_for_year(year=2015, gender="F", top_n=5):
df = load_names_data()
df = df[(df["year"] == year) & (df["gender"] == gender)]
df = df.sort_values("count")[:top_n]
return df
# end_data_routine
# start_HTMLExample1
class HTMLExample1:
"""
Demonstrate rendering of a simple dataframe into html
"""
@classmethod
def render_html(cls):
# load data
df = load_names_data()
df = df[:100]
# build presentation model
pm = tc.build_presentation_model(df=df, output_format="html")
# render to xlsx
tempdir = tempfile.gettempdir()
fp = os.path.join(tempdir, "example_1.html")
layout = [pm]
print("Writing to " + fp)
html = htmlw.HTMLWriter.to_html(layout, border=1)
output_fp = os.path.join(tempfile.gettempdir(), "example1.html")
with open(output_fp, "w") as f:
f.write(html)
# end_HTMLExample1
# start_HTMLExample2
class HTMLExample2:
"""
Demonstrate rendering of a simple dataframe into html
"""
@staticmethod
def data_value_func(df):
def _inner(idx, col):
if col == "gender":
if df.loc[idx, col] == "F":
return "Female"
return "Male"
return df.loc[idx, col]
return _inner
@staticmethod
def data_style_func(df):
def _inner(idx, col):
color = "#FFFFFF"
text_align = "left"
if col == "count":
text_align = "right"
if df.loc[idx, "gender"] == "F":
color = "#bbdef8"
else:
color = "#e3f2fd"
return html_style.td_style(
text_align=text_align,
background_color=color,
color="#000000",
font_weight="normal",
white_space="pre",
padding="10px",
border=None,
)
return _inner
@staticmethod
def index_name_value_func(value):
return value.capitalize()
@staticmethod
def header_value_func(node):
return node.value.capitalize()
@staticmethod
def header_style_func(node):
return html_style.td_style(
text_align="center",
background_color="#4F81BD",
color="#FFFFFF",
font_weight="bold",
white_space="pre",
padding="10px",
border=1,
)
@staticmethod
def index_value_func(node):
return node.value.capitalize()
@staticmethod
def index_style_func(node):
return html_style.td_style(
text_align="center",
background_color="#4F81BD",
color="#FFFFFF",
font_weight="bold",
white_space="pre",
padding="10px",
border=1,
)
@classmethod
def render_html(cls):
# load data
df = sample_names_data()
# build presentation model
klass_ = HTMLExample2
pm = tc.build_presentation_model(
df=df,
output_format="html",
data_value_func=klass_.data_value_func(df),
data_style_func=klass_.data_style_func(df),
header_value_func=klass_.header_value_func,
header_style_func=klass_.header_style_func,
index_style_func=klass_.index_style_func,
index_value_func=klass_.index_value_func,
index_name_func=klass_.index_name_value_func,
)
layout = [pm]
html = htmlw.HTMLWriter.to_html(layout, border=1)
output_fp = os.path.join(tempfile.gettempdir(), "example2.html")
print("Writing to =", output_fp)
with open(output_fp, "w") as f:
f.write(html)
# end_HTMLExample2
# start_HTMLExample3
class HTMLExample3:
"""
Demonstrate styling and rendering of multiple multi-hierarchical indexed dataframe
into a html file
"""
@staticmethod
def data_value_func(df):
def _inner(idx, col):
if col == "gender":
if df.loc[idx, col] == "F":
return "Female"
return "Male"
return df.loc[idx, col]
return _inner
@staticmethod
def data_style_func(df):
def _inner(idx, col):
color = "#FFFFFF"
text_align = "left"
if col == "count":
text_align = "right"
if df.loc[idx, "gender"] == "F":
color = "#bbdef8"
else:
color = "#e3f2fd"
return html_style.td_style(
text_align=text_align,
background_color=color,
color="#000000",
font_weight="normal",
white_space="pre",
padding="10px",
border=None,
)
return _inner
@staticmethod
def index_name_value_func(value):
return "Max By Year"
@staticmethod
def header_value_func(node):
return node.value.capitalize()
@staticmethod
def header_style_func(node):
return html_style.td_style(
text_align="center",
background_color="#4F81BD",
color="#FFFFFF",
font_weight="bold",
white_space="pre",
padding="10px",
border=1,
)
@staticmethod
def index_value_func(node):
if isinstance(node.value, str):
return node.value.capitalize()
return node.value
@staticmethod
def index_style_func(node):
return html_style.td_style(
text_align="center",
background_color="#4F81BD",
color="#FFFFFF",
font_weight="bold",
white_space="pre",
padding="10px",
border=1,
)
@classmethod
def render_html(cls):
# Prepare first data frame (same as in render_xlsx)
df = sample_names_data()
# build presentation model
klass_ = HTMLExample4
pm_all = tc.build_presentation_model(
df=df,
output_format="html",
data_value_func=klass_.data_value_func(df),
data_style_func=klass_.data_style_func(df),
header_value_func=klass_.header_value_func,
header_style_func=klass_.header_style_func,
index_style_func=klass_.index_style_func,
index_value_func=klass_.index_value_func,
index_name_func=lambda _: "Sample Data",
)
male_df = top_names_for_year(gender="M")
pm_top_male = tc.build_presentation_model(
df=male_df,
output_format="html",
data_value_func=klass_.data_value_func(male_df),
data_style_func=klass_.data_style_func(male_df),
header_value_func=klass_.header_value_func,
header_style_func=klass_.header_style_func,
index_style_func=klass_.index_style_func,
index_value_func=klass_.index_value_func,
index_name_func=lambda _: "Max by Year",
)
female_df = top_names_for_year(gender="F")
pm_top_female = tc.build_presentation_model(
df=female_df,
output_format="html",
data_value_func=klass_.data_value_func(female_df),
data_style_func=klass_.data_style_func(female_df),
header_value_func=klass_.header_value_func,
header_style_func=klass_.header_style_func,
index_style_func=klass_.index_style_func,
index_value_func=klass_.index_value_func,
index_name_func=lambda _: "Max by Year",
)
layout = [pm_all, [pm_top_female, pm_top_male]]
# render to xlsx
html = htmlw.HTMLWriter.to_html(layout, border=1, orientation="horizontal")
output_fp = os.path.join(tempfile.gettempdir(), "example3.html")
print("Writing to =", output_fp)
with open(output_fp, "w") as f:
f.write(html)
# end_HTMLExample3
# start_HTMLExample4
class HTMLExample4:
"""
Demonstrate styling and rendering of multi-hierarchical indexed dataframe
into a html file.
"""
@staticmethod
def data_value_func(df):
def _inner(idx, col):
if col == "gender":
if df.loc[idx, col] == "F":
return "Female"
return "Male"
return df.loc[idx, col]
return _inner
@staticmethod
def data_style_func(df):
def _inner(idx, col):
color = "#FFFFFF"
text_align = "left"
if col == "count":
text_align = "right"
if idx[1] == "F":
color = "#bbdef8"
else:
color = "#e3f2fd"
return html_style.td_style(
text_align=text_align,
background_color=color,
color="#000000",
font_weight="normal",
white_space="pre",
padding="10px",
border=None,
)
return _inner
@staticmethod
def index_name_value_func(value):
return "Max By Year"
@staticmethod
def header_value_func(node):
return node.value.capitalize()
@staticmethod
def header_style_func(node):
return html_style.td_style(
text_align="center",
background_color="#4F81BD",
color="#FFFFFF",
font_weight="bold",
white_space="pre",
padding="10px",
border=1,
)
@staticmethod
def index_value_func(node):
if isinstance(node.value, str):
return node.value.capitalize()
return node.value
@staticmethod
def index_style_func(node):
return html_style.td_style(
text_align="center",
background_color="#4F81BD",
color="#FFFFFF",
font_weight="bold",
white_space="pre",
padding="10px",
border=1,
)
@classmethod
def render_html(cls):
# Prepare first data frame (same as in render_xlsx)
data_df = load_names_data()
data_df = data_df[data_df["year"] >= 2000]
g = data_df.groupby(("year", "gender"))
df = g.max()
klass_ = cls
pm = tc.build_presentation_model(
df=df,
output_format="html",
data_value_func=klass_.data_value_func(df),
data_style_func=klass_.data_style_func(df),
header_value_func=klass_.header_value_func,
header_style_func=klass_.header_style_func,
index_style_func=klass_.index_style_func,
index_value_func=klass_.index_value_func,
index_name_func=klass_.index_name_value_func,
)
layout = [pm]
# render to xlsx
html = htmlw.HTMLWriter.to_html(layout, border=1)
output_fp = os.path.join(tempfile.gettempdir(), "example4.html")
print("Writing to =", output_fp)
with open(output_fp, "w") as f:
f.write(html)
# end_HTMLExample4
def main():
HTMLExample1.render_html()
HTMLExample2.render_html()
HTMLExample3.render_html()
HTMLExample4.render_html()
if __name__ == "__main__":
main()