Welcome to table-compositor’s documentation!

Contents:

Introduction

The table-compositor library provides the API to render data stored in table-like data structures. Currently the library only supports rendering data available in a Panda’s DataFrames. The DataFrame layout is used as the table layout(including single and multi hierarchical columns/indices) by the library. The table layout is rendered directly on to an xslx sheet or to a html page. Styling and layout attributes can be used to render colorful xlsx or html reports. The library also supports rendering of multiple data frames into a single xlsx sheet or html page (with horizontal/vertical layouts). The objective of the library is to be able to use the DataFrame as the API to configure the style and layout properties of the report. Callback functions are provided to customize all styling properties. The nice thing about the callback functions are that the style properties are set on cells indexed with index/column values available in the original dataframe used during rendering.

Code: https://github.com/InvestmentSystems/table-compositor

Docs: http://table-compositor.readthedocs.io

Packages: https://pypi.python.org/pypi/table-compositor

Getting Started

The table-compositor library builds on the concept of Panda’s DataFrame as API to render colorful reports. The various ways of providing styling attributes and choosing layouts are demonstrated with numerous examples in the documentation. Please refer to the Bacis section of the documentation to get started with the HelloWorld example.

Installation

A standard setuptools installer is available via PyPI:

https://pypi.python.org/pypi/table-compositor

Or, install via pip3:

pip3 install table-compositor

Source code can be obtained here:

https://github.com/InvestmentSystems/table-compositor

Basics

The purpose of this library is to use the Pandas DataFrame as an interface to represent the layout of a table that needs to be rendered to an xlsx file or as an html table. The library abstracts away the tedious work of working at the cell level of an xlsx sheet or a html table. It provides a call-back mechanism by which the user is able to provide values that need to be rendered and also the styling that needs to be used for each cell in the rendered table. The library is also capable of laying out multiple tables in the same sheet which are evenly spaced vertically or horizontally based on the layout configuration provided.

Sample Data

During the later part of this documentation, we will use the sample data from the Social Security Administration which contains the U.S. child birth name records. We choose this sample data for two reasons. We reuse some of the discussion that are outlined by Wes McKinnery’s Python For Data Analysis, 2nd Edition(2017). The same data is also used in the documentation of another library function-pipe <http://function-pipe.readthedocs.io/en/latest/index.html> that the Investment Systems Group has open-sourced.

https://www.ssa.gov/oact/babynames/names.zip

Further more, we will assume that a flattened file from all the smaller files in the .zip file is available after we invoke the following function.

Please refer to the XLSX Examples section for code that loads this data.

A Hello World Example: Dataframe to Xlsx

Every use of this library involves four steps.

  1. We build a dataframe that resembles the shape of the table that will be rendered.

  2. The dataframe is passed as an argument to the function called build_presentation_model. This function accepts a dataframe and also a number of functions as arguments. We call the value returned by this function, the presentation_model.

  3. Create a layout of multiple presentation models (if we want more than one table rendered in same xlsx sheet or same html page)

  4. Call the render_xlsx or render_html functions on the respective writers. For xlsx files either OpenPyxlCompositor(uses openpyxl library) or XlsxWriterCompositor(uses xlsxwriter library). For HTML use the HTMLWriter.

A Quick Look at a Xlsx example

We will start with a simple dataframe and render the dataframe as-is to a xlsx file

import pandas as pd
from table_compositor.table_compositor import build_presentation_model
from table_compositior.xlsx_writer import OpenPyxlCompositor
# Note: use XlsxWriterCompositor to use xlsxwriter library

sample_df = pd.DataFrame(dict(a=[10, 20, 30, 40, 50], b=[0.1, 0.9,0.2, 0.6,0.3]), index=[1,2,3,4,5])

# create a presentation model
# defaults to engine='openpyxl'. Needs to be set to 'xlsxwriter' to use `xlsxwriter` library instead.
presentation_model = build_presentation_model(df=sample_df)

# create a layout, which is usually a nested list of presentation models
layout = [presentation_model]

# render to xlsx
output_fp = '/tmp/example1.xlsx'
OpenPyxlCompositor.to_xlsx(layout, output_fp=output_fp)

Running this code produces the following output:

_images/xlsx_basic_example1.png

In the above code snippet, we first created a dataframe called sample_df.

To render this dataframe, we first invoke build_presentation_model. The build_presentation_model accepts the dataframe as its first argument. In this example, we use the defaults provided by this method for all other arguments. The build_presentation_model returns an presentation_model object.

Before we call OpenPyxlCompositor.to_xlsx we create a layout. A layout is a nested list of presentation_models. In our case, since we have only one presentation_model we create a list with a single element. Later on when we work with multiple presentation models that need to be rendered on to the same sheet, we could create nested list such as [[model1, model2], [model3]] etc.

Building the Presentation Model

The build_presentation_model function is the most important interface in this library. This function exposes all the functionality that is required to render beautiful looking excel worksheets or html tables.

We will now build up on our previous example and add styling to the report we generate. Before, we do that lets take a quick look at the signature of build_presentation__model.

table_compositor.table_compositor.build_presentation_model(*, df, output_format='xlsx', data_value_func=None, column_style_func=None, data_style_func=None, header_style_func=None, header_value_func=None, index_style_func=None, index_value_func=None, index_name_func=None, index_name_style_func=None, engine='openpyxl', **kwargs)[source]

Construct and return the presentation model that will be used while rendering to html/xlsx formats. The returned object has all the information required to render the tables in the requested format. The details of the object is transparent to the caller. It is only exposed for certain advanced operations.

Parameters
  • df – The dataframe representation of the table. The shape of the dataframe closely resembles the table that will be rendered in the requested format.

  • output_format – ‘html’ or ‘xlsx’

  • data_value_func – example: lambda idx, col: df.loc[idx, col], assuming df is in the closure. This can be None, if no data transformation is required to the values already present in the source df

  • column_style_func – the function can substitute the data_style_func, if the same style can be applied for the whole column. This argument should be prefered over the data_style_func argument. Using this option provides better performance since the fewer objects will be created internally and fewer callbacks are made to this function when compared to data_style_func.This argument only applies to the data contained in the dataframe and not the cell where the headers are rendered. For fine grained control at cell level, the data_style_func argument can be used. For more information on return values of this function, refer to the documentation for data_style_func argument.

  • data_style_func – used to provide style at the cell level. Example: lambda idx, col: return dict(font=Font(…)), where Font is the openpyxl object and font is the attr available in the cell instance of openpyxl. For xlsx, the keys in the dict are the attrs of the cell object in openpyxl and the values correspond to the value of that attribute. Example are found in xlsx_styles module. For html, the key-value pairs are any values that go into to the style attribute of a td, th cell in html. Examples are found in html_styles module. example: dict(background-color=’#F8F8F8’). When performance becomes an issue, and cell level control is not needed, it is recommended to use the column_style_func argument rathat than this argument. If the prefered engine is XlswWriter, then the style dictionary returned should have key/values compatible with the Format object declarted in the XlsxWriter library. A reference can be found in ``xlsx_styles.XlsxWriterStyleHelper` class

  • header_value_func – func that takes a object of type IndexNode. The IndexNode contains the attributes that refer to the header being rendered. The returned value from this function is displayed in place of the header in the dataframe at the location. The two properties available on the IndexNode object are value and key. The key is useful to identify the exact index and level in context while working with multi-hierarchical columns.

  • header_style_func – func that takes a object of type IndexNode. The return value of this function is similar to data_style_func.

  • index_value_func – func that takes a object of type IndexNode. The IndexNode contains the attributes that refer to the index being rendered. The returned value from this function is displayed in place of the index in the dataframe at the location.

  • index_style_func – func that takes a object of type IndexNode. The return value of this function is similar to data_style_func.

  • index_name_func – func that returns a string for index name (value to be displayed on top-left corner, above the index column)

  • index_name_style – the style value same as data_style_func that will be used to style the cell

  • engine – required while building presentation model for xlsx. Argument ignored for HTML rendering. This argument is used to provide the default callback style functions, where the style dictionary returned by the callback functions should be compatible with the engine being used.

  • kwargs

    ‘hide_index’ - if True, then hide the index column, default=False

    ’hide_header, - if True, then hide the header, default=False

    ’use_convert’ - if True, do some conversions from dataframe values to values excel can understand for example np.NaN are converted to NaN strings

Returns

A presentation model, to be used to create layout and provide the layout to the html or xlsx writers.

About the callback functions provided as arguments:

Note that callback function provided as arguments to this function are provided with either a tuple of index, col arguments are some information regarding the index or headers being rendered. Therefore, a common pattern would be to capture the dataframe being rendered in a closure of this callback func before passing them as arugments.

For example:

df = pd.DataFrame(dict(a=[1, 2, 3]))

def data_value_func():
def _inner(idx, col):

return df.loc[idx, col] * 10.3

return _inner

pm = build_presentation_model(df=df, data_value_func=data_value_func())

Improving on our first iteration

Now, that we got a overview of the build_presentation_mode function, lets try setting these arguments to improve the look of our reports.

Say, we have the following requirements:

  1. Display column ‘A’ as in dollar format.

  2. Display column ‘B’ as percentage values.’

  3. Set back-ground color of column ‘B’ to red if value is less than 50%

  4. Capitalize all the column headers and add a yellow background

  5. Multiply all index values by 100 while rendering and add a color to the background.

  6. Display a ‘custom text’ on the top left corner, where pandas whole usually display the index name if available.

We update our previous example to do the following:

 1import os
 2import tempfile
 3
 4import pandas as pd
 5
 6from table_compositor.table_compositor import build_presentation_model
 7from table_compositor.xlsx_styles import OpenPyxlStyleHelper
 8
 9# There are equivalent classes for using xlsxwriter library. Namely,
10# XlsxWriterCompositor and XlsxWriterStyleHelper
11from table_compositor.xlsx_writer import OpenPyxlCompositor
12
 1def basic_example2():
 2
 3    df = pd.DataFrame(
 4        dict(a=[10, 20, 30, 40, 50], b=[0.1, 0.9, 0.2, 0.6, 0.3]), index=[1, 2, 3, 4, 5]
 5    )
 6
 7    def style_func(idx, col):
 8        if col == "b":
 9            return OpenPyxlStyleHelper.get_style(number_format="0.00%")
10        else:
11            # for 'a' we do dollar format
12            return OpenPyxlStyleHelper.get_style(number_format="$#,##.00")
13
14    # create a presentation model
15    # note the OpenPyxlStyleHelper function available in xlsx_styles module. But a return value of style function
16    # can be any dict whose keys are attributes of the OpenPyxl cell object.
17    presentation_model = build_presentation_model(
18        df=df,
19        data_value_func=lambda idx, col: df.loc[idx, col] * 10
20        if col == "a"
21        else df.loc[idx, col],
22        data_style_func=style_func,
23        header_value_func=lambda node: node.value.capitalize(),
24        header_style_func=lambda _: OpenPyxlStyleHelper.default_header_style(),
25        index_value_func=lambda node: node.value * 100,
26        index_style_func=lambda _: OpenPyxlStyleHelper.default_header_style(),
27        index_name_func=lambda _: "Basic Example",
28        index_name_style_func=lambda _: OpenPyxlStyleHelper.default_header_style(),
29    )
30
31    # create a layout, which is usually a nested list of presentation models
32    layout = [presentation_model]
33
34    # render to xlsx
35    output_fp = os.path.join(tempfile.gettempdir(), "basic_example2.xlsx")
36    OpenPyxlCompositor.to_xlsx(layout=layout, output_fp=output_fp)
37
38

On line 3 we create the dataframe.

To satisfy the requirements we listed above we pass the callback function to the build_presentation_model. Note that some helper functions are available in xlsx_style function to create styles for openpyxl. But, any other dict with keys that are attr of cell object of openpyxl should work. The above example produces the output as shown below:

_images/xlsx_basic_example2.png

Multi-hierarchical columns and indices

Rendering dataframes with multi-hierarchical columns or indices are very similar to rendering the simpler dataframes. The data_value_func and data_style_func work the same way. The functions that handle index cell rendering and column header rendering can access the IndexNode object that is passed to those functions to determine the value and level that is currently being rendered. This becomes clearer with an example.

We demonstrate this by setting a variety of colors to each cell that holds one of the values of the hierarchical columns or indices.

Note that the IndexNode argument passed to the callback function has a node.key field that unique identifies each cell with a name that is built appending the value of each item in the index or column hierarchy.

 1import os
 2import tempfile
 3
 4import pandas as pd
 5
 6from table_compositor.table_compositor import build_presentation_model
 7from table_compositor.xlsx_styles import OpenPyxlStyleHelper
 8
 9# There are equivalent classes for using xlsxwriter library. Namely,
10# XlsxWriterCompositor and XlsxWriterStyleHelper
11from table_compositor.xlsx_writer import OpenPyxlCompositor
12
 1def basic_example3():
 2
 3    df = pd.DataFrame(
 4        dict(
 5            a=[10, 20, 30, 40],
 6            b=[0.1, 0.9, 0.2, 0.6],
 7            d=[50, 60, 70, 80],
 8            e=[200, 300, 400, 500],
 9        )
10    )
11    df.columns = pd.MultiIndex.from_tuples(
12        [("A", "x"), ("A", "y"), ("B", "x"), ("B", "y")]
13    )
14    df.index = pd.MultiIndex.from_tuples([(1, 100), (1, 200), (2, 100), (2, 200)])
15    print(df)
16
17    def index_style_func(node):
18        # node.key here could be one of (1,), (1, 100), (2,), (2, 100), (2, 200)
19        bg_color = "FFFFFF"
20        if node.key == (1,) or node.key == (2,):
21            bg_color = "9E80B8"
22        elif node.key[1] == 100:
23            bg_color = "4F90C1"
24        elif node.key[1] == 200:
25            bg_color = "6DC066"
26        return OpenPyxlStyleHelper.get_style(bg_color=bg_color)
27
28    def header_style_func(node):
29        bg_color = "FFFFFF"
30        if node.key == ("A",) or node.key == ("B",):
31            bg_color = "9E80B8"
32        elif node.key[1] == "x":
33            bg_color = "4F90C1"
34        elif node.key[1] == "y":
35            bg_color = "6DC066"
36        return OpenPyxlStyleHelper.get_style(bg_color=bg_color)
37
38    # create a presentation model
39    # note the OpenPyxlStyleHeloer function available in xlsx_styles module. But a return value of style function
40    # can be any dict whose keys are attributes of the OpenPyxl cell object.
41    presentation_model = build_presentation_model(
42        df=df,
43        index_style_func=index_style_func,
44        header_style_func=header_style_func,
45        index_name_func=lambda _: "Multi-Hierarchy Example",
46    )
47
48    # create a layout, which is usually a nested list of presentation models
49    layout = [presentation_model]
50
51    # render to xlsx
52    output_fp = os.path.join(tempfile.gettempdir(), "basic_example3.xlsx")
53    OpenPyxlCompositor.to_xlsx(layout=layout, output_fp=output_fp)
54
55

The above function gives us the xlsx file shown below. Note the colors used to render the indices and columns and review how the two functions, namely, index_style_function and header_style_function provide the colors based on the IndexNode attributes. You will notice the use of node.key in these functions to identify each cell uniquely.

_images/xlsx_basic_example3.png

Layouts

Apart from providing styling and formatting facilities, the library also provides a powerful way to layout multiple tables on one sheet. In this section we will look at some examples.

We will use the same presentation model from basic_example2(). We will layout the presentation models with different layouts.

 1import os
 2import tempfile
 3
 4import pandas as pd
 5
 6from table_compositor.table_compositor import build_presentation_model
 7from table_compositor.xlsx_styles import OpenPyxlStyleHelper
 8
 9# There are equivalent classes for using xlsxwriter library. Namely,
10# XlsxWriterCompositor and XlsxWriterStyleHelper
11from table_compositor.xlsx_writer import OpenPyxlCompositor
12
 1def layout_example1():
 2
 3    df = pd.DataFrame(
 4        dict(a=[10, 20, 30, 40, 50], b=[0.1, 0.9, 0.2, 0.6, 0.3]), index=[1, 2, 3, 4, 5]
 5    )
 6
 7    def style_func(idx, col):
 8        if col == "b":
 9            return OpenPyxlStyleHelper.get_style(number_format="0.00%")
10        else:
11            # for 'a' we do dollar format
12            return OpenPyxlStyleHelper.get_style(number_format="$#,##.00")
13
14    # create a presentation model
15    # note the OpenPyxlStyleHeloer function available in xlsx_styles module. But a return value of style function
16    # can be any dict whose keys are attributes of the OpenPyxl cell object.
17    presentation_model = build_presentation_model(
18        df=df,
19        data_value_func=lambda idx, col: df.loc[idx, col] * 10
20        if col == "a"
21        else df.loc[idx, col],
22        data_style_func=style_func,
23        header_value_func=lambda node: node.value.capitalize(),
24        header_style_func=lambda _: OpenPyxlStyleHelper.default_header_style(),
25        index_value_func=lambda node: node.value * 100,
26        index_style_func=lambda _: OpenPyxlStyleHelper.default_header_style(),
27        index_name_func=lambda _: "Basic Example",
28        index_name_style_func=lambda _: OpenPyxlStyleHelper.default_header_style(),
29    )
30
31    # start_layout_code_1
32    # create a layout, which is usually a nested list of presentation models
33    layout = [[presentation_model], [[presentation_model], [presentation_model]]]
34
35    # render to xlsx
36    output_fp = os.path.join(tempfile.gettempdir(), "layout_vertical_example1.xlsx")
37    # the default value for orientation is 'vertical'
38    OpenPyxlCompositor.to_xlsx(
39        layout=layout, output_fp=output_fp, orientation="vertical"
40    )
41
42    output_fp = os.path.join(tempfile.gettempdir(), "layout_horizontal_example1.xlsx")
43    OpenPyxlCompositor.to_xlsx(
44        layout=layout, output_fp=output_fp, orientation="horizontal"
45    )
46    print("Writing xlsx file=", output_fp)
47
48    # mutiple nesting
49    layout_complex = [
50        presentation_model,
51        [presentation_model, [presentation_model, presentation_model]],
52    ]
53
54    output_fp = os.path.join(tempfile.gettempdir(), "layout_complex_example1.xlsx")
55    OpenPyxlCompositor.to_xlsx(
56        layout=layout_complex, output_fp=output_fp, orientation="vertical"
57    )
58    print("Writing xlsx file=", output_fp)
59    # end_layout_code_1
60
61

In the preceding example, we create two layouts. On line 28, we have a layout defined and then rendered to two files with different orientations.

When the orientation is vertical, then each item (presentation_model) in the list is layed out vertically. The orientation flips between vertical and horizontal for every nested listed that is encountered. In this example, you will notice that since the second item in the outer list is a list, the two presentation models in the inner list are rendered side-by-side (i.e. with horizontal orientation)

_images/layout_example1_1.png

When the value of orientation argument is changed to horizontal, the renderer renders the outerlist horizontally and flips the orientation of inner lists to vertical. The second output is show below.

_images/layout_example1_2.png

Example of XLSX Styles

In the preceding examples, we have used the functions provided by xslx_styles.OpenPyxlStyleHelper to return the required style dictionary. Some examples of style dictionaries that can be returned by functions returning styles are provided below for reference. For details on how to build style attributes refer to the openpyxl documentation.

Style with background color

from openpyxl.styles import PatternFill
from openpyxl.styles import fills

fill = PatternFill(fgColor=Color('4f81BD'), patternType=fills.FILL_SOLID)

# note that we return a dict, whose key = `fill` which is an
# attrbute of  `cell` object in `openpyxl`
style = dict(fill=fill)

Style with percentage formatting

number_format = '0.00%'

fill = PatternFill(fgColor=Color('4f81BD'), patternType=fills.FILL_SOLID)

# note that we return a dict, whose key = `number_format` which is an
# attrbute of  `cell` object in `openpyxl`
style = dict(number_format=number_format)

Style with alignment and fonts

from openpyxl.styles import Alignment
from openpyxl.styles import Font
font=Font(bold=True, color='FFFFFF')

# note that we return a dict, whose key = `number_format` which is an
# attrbute of  `cell` object in `openpyxl`
style = dict(alignment=Alignment(horizontal=center, font=font)

Using a different XLXS Writer Engine

Note that if xlsxwriter library is uses, the keys in the dictionary returned by the callback funcs should match the keys required to build the Format object declared in the xlsxwriter library. Some examples of these keys can be found in xlsx_styles.XlsxWriterStyleHelper class.

Example of HTML Styles

In the Basic section, we only saw examples of how to render dataframes to a xlsx format. The same setup can be used to render dataframes to HTML using the html_writer.HTMLWriter.to_html function. The only thing that has to be changed is the style attributes that are being returned. We want our style providing functions to return style attributes that can be inlined into the style attribute of a <td> or <th> tag.

Some examples of style dictionaries that can be return by functions returning styles are provided below for reference.

Style for headers

style = dict(
 text_align='center',
 background_color='#4F81BD',
 color='#FFFFFF',
 font_weight='bold',
 white_space='pre',
 padding='10px',
border=1)

Style for cell holding numeric values

numeric_style = dict(
             text_align='right',
             background_color='#FFFFFF',
             color='#000000',
             font_weight='normal',
             white_space='pre',
             padding='10px',
             border=None)

Style using the html_styles.td_style object

style = td_style(
            text_align='center',
            background_color='#4F81BD',
            color='#FFFFFF',
            font_weight='bold',
            white_space='pre',
            padding='10px',
            border=1)

XLSX Examples

This page provides a list of examples that demonstrate rendering xlsx output from the given dataframe. Each example is self-contained inside a class. We have some helper functions to provide the data we need for this examples

All examples listed in this section use OpenPyxlStyleHelper and OpenPyxlCompositor. To use the xlsxwriter library replace these with XlsxWriterStyleHelper and XlsxWriterCompositor respectively.

If the callback funcs do not use the *StyleHelpers and return their own Style objects then the objects returned should be compatible with the value engine provided to the engine attribute.

Examples of relevant style objects can be found respective documentations for the engine being used.

Helper Functions (Data loading routines)

 1import tempfile
 2import webbrowser
 3import zipfile
 4
 5import pandas as pd
 6import requests
 7
 8import table_compositor.table_compositor as tc
 9import table_compositor.xlsx_styles as xlsstyle
10import table_compositor.xlsx_writer as xlsxw
11
 1# code snippet adapted from http://function-pipe.readthedocs.io/en/latest/usage_df.html
 2# source url
 3URL_NAMES = "https://www.ssa.gov/oact/babynames/names.zip"
 4ZIP_NAME = "names.zip"
 5
 6
 7def load_names_data():
 8    fp = os.path.join(tempfile.gettempdir(), ZIP_NAME)
 9    if not os.path.exists(fp):
10        r = requests.get(URL_NAMES)
11        with open(fp, "wb") as f:
12            f.write(r.content)
13
14    post = collections.OrderedDict()
15    with zipfile.ZipFile(fp) as zf:
16        # get ZipInfo instances
17        for zi in sorted(zf.infolist(), key=lambda zi: zi.filename):
18            fn = zi.filename
19            if fn.startswith("yob"):
20                year = int(fn[3:7])
21                df = pd.read_csv(
22                    zf.open(zi), header=None, names=("name", "gender", "count")
23                )
24                df["year"] = year
25                post[year] = df
26
27        df = pd.concat(post.values())
28        df.set_index("name", inplace=True, drop=True)
29        return df
30
31
32def sample_names_data():
33    df = load_names_data()
34    df = df[(df["year"] == 2015) & (df["count"] > 1000)]
35    return df.sample(100, random_state=0).sort_values("count")
36
37
38def top_names_for_year(year=2015, gender="F", top_n=5):
39    df = load_names_data()
40    df = df[(df["year"] == year) & (df["gender"] == gender)]
41    df = df.sort_values("count")[:top_n]
42    return df
43
44

Example 1 - DataFrame with default styles

Demonstrates converting dataframe into html format with default styles.

 1class XLSXExample1:
 2    """
 3    Demonstrates rendering a simple dataframe to a xlsx file
 4    using the default styles
 5    """
 6
 7    @classmethod
 8    def render_xlsx(cls):
 9        """
10        Render the df to a xlsx file.
11        """
12
13        # load data
14        df = sample_names_data()
15        # build presentation model
16        pm = tc.build_presentation_model(df=df, output_format="xlsx")
17
18        # render to xlsx
19        tempdir = tempfile.gettempdir()
20        fp = os.path.join(tempdir, "example1.xlsx")
21        layout = [pm]
22        print("Writing to " + fp)
23        xlsxw.OpenPyxlCompositor.to_xlsx(layout=layout, output_fp=fp)
24
25
_images/xlsx_example1.png

Example 2 - DataFrame with custom styles

In this example, we format the different components of dataframe with various styling attributes

 1class XLSXExample2:
 2    """
 3    Demonstrates using call-backs that help set the display and style
 4    properties of each cell in the xlsx sheet.
 5    """
 6
 7    @staticmethod
 8    def data_value_func(df):
 9        def _inner(idx, col):
10            if col == "gender":
11                if df.loc[idx, col] == "F":
12                    return "Female"
13                return "Male"
14            return df.loc[idx, col]
15
16        return _inner
17
18    @staticmethod
19    def data_style_func(df):
20        def _inner(idx, col):
21            bg_color = None
22            number_format = "General"
23            if col == "count":
24                number_format = "#,##0"
25            if df.loc[idx, "gender"] == "F":
26                bg_color = "bbdef8"
27            else:
28                bg_color = "e3f2fd"
29            return xlsstyle.OpenPyxlStyleHelper.get_style(
30                bg_color=bg_color, number_format=number_format
31            )
32
33        return _inner
34
35    @staticmethod
36    def index_name_value_func(value):
37        return value.capitalize()
38
39    @staticmethod
40    def index_name_style_func(value):
41        return xlsstyle.OpenPyxlStyleHelper.default_header_style()
42
43    @staticmethod
44    def header_value_func(node):
45        return node.value.capitalize()
46
47    @staticmethod
48    def header_style_func(node):
49        return xlsstyle.OpenPyxlStyleHelper.default_header_style()
50
51    @staticmethod
52    def index_value_func(node):
53        return node.value.capitalize()
54
55    @staticmethod
56    def index_style_func(df):
57        def _inner(node):
58            bg_color = None
59            if df.loc[node.value, "gender"] == "F":
60                bg_color = "bbdef8"
61            else:
62                bg_color = "e3f2fd"
63            return xlsstyle.OpenPyxlStyleHelper.get_style(bg_color=bg_color)
64
65        return _inner
66
67    @classmethod
68    def render_xlsx(cls):
69        # load data
70        df = sample_names_data()
71        # build presentation model
72        klass_ = XLSXExample2
73        pm = tc.build_presentation_model(
74            df=df,
75            output_format="xlsx",
76            data_value_func=klass_.data_value_func(df),
77            data_style_func=klass_.data_style_func(df),
78            header_value_func=klass_.header_value_func,
79            header_style_func=klass_.header_style_func,
80            index_style_func=klass_.index_style_func(df),
81            index_value_func=klass_.index_value_func,
82            index_name_style_func=klass_.index_name_style_func,
83            index_name_func=klass_.index_name_value_func,
84        )
85
86        # render to xlsx
87        tempdir = tempfile.gettempdir()
88        fp = os.path.join(tempdir, "example2.xlsx")
89        layout = [pm]
90        print("Writing to " + fp)
91        xlsxw.OpenPyxlCompositor.to_xlsx(layout=layout, output_fp=fp)
92
93
_images/xlsx_example2.png

Example 3 - Simple DataFrame with Layouts

Demonstrates rendering multi-dataframes in one worksheet along with common functions for styling

  1class XLSXExample3:
  2    """
  3    Demonstrates using call-backs and also rendering multiple tables to single
  4    worksheet.
  5    """
  6
  7    @staticmethod
  8    def data_value_func(df):
  9        def _inner(idx, col):
 10            if col == "gender":
 11                if df.loc[idx, col] == "F":
 12                    return "Female"
 13                return "Male"
 14            return df.loc[idx, col]
 15
 16        return _inner
 17
 18    @staticmethod
 19    def data_style_func(df):
 20        def _inner(idx, col):
 21            bg_color = None
 22            number_format = "General"
 23            if col == "count":
 24                number_format = "#,##0"
 25            if df.loc[idx, "gender"] == "F":
 26                bg_color = "bbdef8"
 27            else:
 28                bg_color = "e3f2fd"
 29            return xlsstyle.OpenPyxlStyleHelper.get_style(
 30                bg_color=bg_color, number_format=number_format
 31            )
 32
 33        return _inner
 34
 35    @staticmethod
 36    def index_name_value_func(value):
 37        return value.capitalize()
 38
 39    @staticmethod
 40    def index_name_style_func(value):
 41        return xlsstyle.OpenPyxlStyleHelper.default_header_style()
 42
 43    @staticmethod
 44    def header_value_func(node):
 45        return node.value.capitalize()
 46
 47    @staticmethod
 48    def header_style_func(node):
 49        return xlsstyle.OpenPyxlStyleHelper.default_header_style()
 50
 51    @staticmethod
 52    def index_value_func(node):
 53        return node.value.capitalize()
 54
 55    @staticmethod
 56    def index_style_func(df):
 57        def _inner(node):
 58            bg_color = None
 59            if df.loc[node.value, "gender"] == "F":
 60                bg_color = "bbdef8"
 61            else:
 62                bg_color = "e3f2fd"
 63            return xlsstyle.OpenPyxlStyleHelper.get_style(bg_color=bg_color)
 64
 65        return _inner
 66
 67    @classmethod
 68    def render_xlsx(cls):
 69        # Prepare first data frame (same as in render_xlsx)
 70        df = sample_names_data()
 71        # build presentation model
 72        klass_ = XLSXExample3
 73        pm_all = tc.build_presentation_model(
 74            df=df,
 75            output_format="xlsx",
 76            data_value_func=klass_.data_value_func(df),
 77            data_style_func=klass_.data_style_func(df),
 78            header_value_func=klass_.header_value_func,
 79            header_style_func=klass_.header_style_func,
 80            index_style_func=klass_.index_style_func(df),
 81            index_value_func=klass_.index_value_func,
 82            index_name_style_func=klass_.index_name_style_func,
 83            index_name_func=klass_.index_name_value_func,
 84        )
 85
 86        male_df = top_names_for_year(gender="M")
 87        pm_top_male = tc.build_presentation_model(
 88            df=male_df,
 89            output_format="xlsx",
 90            data_value_func=klass_.data_value_func(male_df),
 91            data_style_func=klass_.data_style_func(male_df),
 92            header_value_func=klass_.header_value_func,
 93            header_style_func=klass_.header_style_func,
 94            index_style_func=klass_.index_style_func(male_df),
 95            index_value_func=klass_.index_value_func,
 96            index_name_style_func=klass_.index_name_style_func,
 97            index_name_func=klass_.index_name_value_func,
 98        )
 99
100        female_df = top_names_for_year(gender="F")
101        pm_top_female = tc.build_presentation_model(
102            df=female_df,
103            output_format="xlsx",
104            data_value_func=klass_.data_value_func(female_df),
105            data_style_func=klass_.data_style_func(female_df),
106            header_value_func=klass_.header_value_func,
107            header_style_func=klass_.header_style_func,
108            index_style_func=klass_.index_style_func(female_df),
109            index_value_func=klass_.index_value_func,
110            index_name_style_func=klass_.index_name_style_func,
111            index_name_func=klass_.index_name_value_func,
112        )
113
114        layout = [pm_all, [pm_top_female, pm_top_male]]
115        # render to xlsx
116        tempdir = tempfile.gettempdir()
117        fp = os.path.join(tempdir, "example3.xlsx")
118        print("Writing to " + fp)
119        xlsxw.OpenPyxlCompositor.to_xlsx(
120            layout=layout, output_fp=fp, orientation="horizontal"
121        )
122
123
_images/xlsx_example3.png

Example 4 - DataFrames with Multi-hierarchical columns and indices

Demonstrates rendering dataframes with multi-hierarchical indices and mult-hierarchical columns

 1class XLSXExample4:
 2    """
 3    Demonstrate styling and rendering of multi-hierarchical indexed dataframe
 4    into a xlsx file.
 5    """
 6
 7    @staticmethod
 8    def data_style_func(df):
 9        def _inner(idx, col):
10            bg_color = None
11            number_format = "General"
12            if col == "count":
13                number_format = "#,##0"
14            if idx[1] == "F":
15                bg_color = "bbdef8"
16            else:
17                bg_color = "e3f2fd"
18            return xlsstyle.OpenPyxlStyleHelper.get_style(
19                bg_color=bg_color, number_format=number_format
20            )
21
22        return _inner
23
24    @staticmethod
25    def index_name_value_func(value):
26        return "Max By Year"
27
28    @staticmethod
29    def index_name_style_func(value):
30        return xlsstyle.OpenPyxlStyleHelper.default_header_style()
31
32    @staticmethod
33    def header_value_func(node):
34        return node.value.capitalize()
35
36    @staticmethod
37    def header_style_func(node):
38        return xlsstyle.OpenPyxlStyleHelper.default_header_style()
39
40    @staticmethod
41    def index_value_func(node):
42        if isinstance(node.value, str):
43            return node.value.capitalize()
44        return node.value
45
46    @staticmethod
47    def index_style_func(df):
48        def _inner(node):
49            bg_color = None
50            if len(node.key) == 1:
51                bg_color = "4f81bd"
52            elif node.key[1] == "F":
53                bg_color = "bbdef8"
54            else:
55                bg_color = "e3f2fd"
56            return xlsstyle.OpenPyxlStyleHelper.get_style(bg_color=bg_color)
57
58        return _inner
59
60    @classmethod
61    def render_xlsx(cls):
62
63        # Prepare first data frame (same as in render_xlsx)
64        data_df = load_names_data()
65        data_df = data_df[data_df["year"] >= 2000]
66        g = data_df.groupby(("year", "gender"))
67        df = g.max()
68
69        klass_ = cls
70        pm = tc.build_presentation_model(
71            df=df,
72            output_format="xlsx",
73            # data_value_func=None,   # use default
74            data_style_func=klass_.data_style_func(df),
75            header_value_func=klass_.header_value_func,
76            header_style_func=klass_.header_style_func,
77            index_style_func=klass_.index_style_func(df),
78            index_value_func=klass_.index_value_func,
79            index_name_style_func=klass_.index_name_style_func,
80            index_name_func=klass_.index_name_value_func,
81        )
82
83        layout = [pm]
84        # render to xlsx
85        tempdir = tempfile.gettempdir()
86        fp = os.path.join(tempdir, "example4.xlsx")
87        print("Writing to " + fp)
88        xlsxw.OpenPyxlCompositor.to_xlsx(
89            layout=layout, output_fp=fp, orientation="horizontal"
90        )
91
92
_images/xlsx_example4.png

HTML Examples

This page provides a list of examples that demonstrate rendering html tables from the given dataframe. Each example is self-contained inside a class. We have some helper functions to provide the data we need for this examples

Helper Functions (Data loading routines)

 1import tempfile
 2import webbrowser
 3import zipfile
 4
 5import pandas as pd
 6import requests
 7
 8import table_compositor.html_styles as html_style
 9import table_compositor.html_writer as htmlw
10import table_compositor.table_compositor as tc
11
 1# code snippet adapted from http://function-pipe.readthedocs.io/en/latest/usage_df.html
 2# source url
 3URL_NAMES = "https://www.ssa.gov/oact/babynames/names.zip"
 4ZIP_NAME = "names.zip"
 5
 6
 7def load_names_data():
 8    fp = os.path.join(tempfile.gettempdir(), ZIP_NAME)
 9    if not os.path.exists(fp):
10        r = requests.get(URL_NAMES)
11        with open(fp, "wb") as f:
12            f.write(r.content)
13
14    post = collections.OrderedDict()
15    with zipfile.ZipFile(fp) as zf:
16        # get ZipInfo instances
17        for zi in sorted(zf.infolist(), key=lambda zi: zi.filename):
18            fn = zi.filename
19            if fn.startswith("yob"):
20                year = int(fn[3:7])
21                df = pd.read_csv(
22                    zf.open(zi), header=None, names=("name", "gender", "count")
23                )
24                df["year"] = year
25                post[year] = df
26
27        df = pd.concat(post.values())
28        df.set_index("name", inplace=True, drop=True)
29        return df
30
31
32def sample_names_data():
33    df = load_names_data()
34    df = df[(df["year"] == 2015) & (df["count"] > 1000)]
35    return df.sample(50, random_state=0).sort_values("count")
36
37
38def top_names_for_year(year=2015, gender="F", top_n=5):
39    df = load_names_data()
40    df = df[(df["year"] == year) & (df["gender"] == gender)]
41    df = df.sort_values("count")[:top_n]
42    return df
43
44

Example 1 - DataFrame with default styles

Demonstrates converting dataframe into html format with default styles.

 1class HTMLExample1:
 2    """
 3    Demonstrate rendering of a simple dataframe into html
 4    """
 5
 6    @classmethod
 7    def render_html(cls):
 8
 9        # load data
10        df = load_names_data()
11        df = df[:100]
12
13        # build presentation model
14        pm = tc.build_presentation_model(df=df, output_format="html")
15
16        # render to xlsx
17        tempdir = tempfile.gettempdir()
18        fp = os.path.join(tempdir, "example_1.html")
19        layout = [pm]
20        print("Writing to " + fp)
21        html = htmlw.HTMLWriter.to_html(layout, border=1)
22        output_fp = os.path.join(tempfile.gettempdir(), "example1.html")
23        with open(output_fp, "w") as f:
24            f.write(html)
25
26
_images/html_example1.png

Example 2 - DataFrame with custom styles

In this example, we format the different components of dataframe with various styling attributes

  1class HTMLExample2:
  2    """
  3    Demonstrate rendering of a simple dataframe into html
  4    """
  5
  6    @staticmethod
  7    def data_value_func(df):
  8        def _inner(idx, col):
  9            if col == "gender":
 10                if df.loc[idx, col] == "F":
 11                    return "Female"
 12                return "Male"
 13            return df.loc[idx, col]
 14
 15        return _inner
 16
 17    @staticmethod
 18    def data_style_func(df):
 19        def _inner(idx, col):
 20            color = "#FFFFFF"
 21            text_align = "left"
 22            if col == "count":
 23                text_align = "right"
 24            if df.loc[idx, "gender"] == "F":
 25                color = "#bbdef8"
 26            else:
 27                color = "#e3f2fd"
 28            return html_style.td_style(
 29                text_align=text_align,
 30                background_color=color,
 31                color="#000000",
 32                font_weight="normal",
 33                white_space="pre",
 34                padding="10px",
 35                border=None,
 36            )
 37
 38        return _inner
 39
 40    @staticmethod
 41    def index_name_value_func(value):
 42        return value.capitalize()
 43
 44    @staticmethod
 45    def header_value_func(node):
 46        return node.value.capitalize()
 47
 48    @staticmethod
 49    def header_style_func(node):
 50        return html_style.td_style(
 51            text_align="center",
 52            background_color="#4F81BD",
 53            color="#FFFFFF",
 54            font_weight="bold",
 55            white_space="pre",
 56            padding="10px",
 57            border=1,
 58        )
 59
 60    @staticmethod
 61    def index_value_func(node):
 62        return node.value.capitalize()
 63
 64    @staticmethod
 65    def index_style_func(node):
 66        return html_style.td_style(
 67            text_align="center",
 68            background_color="#4F81BD",
 69            color="#FFFFFF",
 70            font_weight="bold",
 71            white_space="pre",
 72            padding="10px",
 73            border=1,
 74        )
 75
 76    @classmethod
 77    def render_html(cls):
 78        # load data
 79        df = sample_names_data()
 80        # build presentation model
 81        klass_ = HTMLExample2
 82        pm = tc.build_presentation_model(
 83            df=df,
 84            output_format="html",
 85            data_value_func=klass_.data_value_func(df),
 86            data_style_func=klass_.data_style_func(df),
 87            header_value_func=klass_.header_value_func,
 88            header_style_func=klass_.header_style_func,
 89            index_style_func=klass_.index_style_func,
 90            index_value_func=klass_.index_value_func,
 91            index_name_func=klass_.index_name_value_func,
 92        )
 93
 94        layout = [pm]
 95        html = htmlw.HTMLWriter.to_html(layout, border=1)
 96        output_fp = os.path.join(tempfile.gettempdir(), "example2.html")
 97        print("Writing to =", output_fp)
 98        with open(output_fp, "w") as f:
 99            f.write(html)
100
101
_images/html_example2.png

Example 3 - Simple DataFrame with Layouts

Demonstrates rendering dataframes with multi-hierarchical indices and mult-hierarchical columns

  1class HTMLExample3:
  2    """
  3    Demonstrate styling and rendering of multiple multi-hierarchical indexed dataframe
  4    into a html file
  5    """
  6
  7    @staticmethod
  8    def data_value_func(df):
  9        def _inner(idx, col):
 10            if col == "gender":
 11                if df.loc[idx, col] == "F":
 12                    return "Female"
 13                return "Male"
 14            return df.loc[idx, col]
 15
 16        return _inner
 17
 18    @staticmethod
 19    def data_style_func(df):
 20        def _inner(idx, col):
 21            color = "#FFFFFF"
 22            text_align = "left"
 23            if col == "count":
 24                text_align = "right"
 25            if df.loc[idx, "gender"] == "F":
 26                color = "#bbdef8"
 27            else:
 28                color = "#e3f2fd"
 29            return html_style.td_style(
 30                text_align=text_align,
 31                background_color=color,
 32                color="#000000",
 33                font_weight="normal",
 34                white_space="pre",
 35                padding="10px",
 36                border=None,
 37            )
 38
 39        return _inner
 40
 41    @staticmethod
 42    def index_name_value_func(value):
 43        return "Max By Year"
 44
 45    @staticmethod
 46    def header_value_func(node):
 47        return node.value.capitalize()
 48
 49    @staticmethod
 50    def header_style_func(node):
 51        return html_style.td_style(
 52            text_align="center",
 53            background_color="#4F81BD",
 54            color="#FFFFFF",
 55            font_weight="bold",
 56            white_space="pre",
 57            padding="10px",
 58            border=1,
 59        )
 60
 61    @staticmethod
 62    def index_value_func(node):
 63        if isinstance(node.value, str):
 64            return node.value.capitalize()
 65        return node.value
 66
 67    @staticmethod
 68    def index_style_func(node):
 69        return html_style.td_style(
 70            text_align="center",
 71            background_color="#4F81BD",
 72            color="#FFFFFF",
 73            font_weight="bold",
 74            white_space="pre",
 75            padding="10px",
 76            border=1,
 77        )
 78
 79    @classmethod
 80    def render_html(cls):
 81
 82        # Prepare first data frame (same as in render_xlsx)
 83        df = sample_names_data()
 84        # build presentation model
 85        klass_ = HTMLExample4
 86        pm_all = tc.build_presentation_model(
 87            df=df,
 88            output_format="html",
 89            data_value_func=klass_.data_value_func(df),
 90            data_style_func=klass_.data_style_func(df),
 91            header_value_func=klass_.header_value_func,
 92            header_style_func=klass_.header_style_func,
 93            index_style_func=klass_.index_style_func,
 94            index_value_func=klass_.index_value_func,
 95            index_name_func=lambda _: "Sample Data",
 96        )
 97
 98        male_df = top_names_for_year(gender="M")
 99        pm_top_male = tc.build_presentation_model(
100            df=male_df,
101            output_format="html",
102            data_value_func=klass_.data_value_func(male_df),
103            data_style_func=klass_.data_style_func(male_df),
104            header_value_func=klass_.header_value_func,
105            header_style_func=klass_.header_style_func,
106            index_style_func=klass_.index_style_func,
107            index_value_func=klass_.index_value_func,
108            index_name_func=lambda _: "Max by Year",
109        )
110
111        female_df = top_names_for_year(gender="F")
112        pm_top_female = tc.build_presentation_model(
113            df=female_df,
114            output_format="html",
115            data_value_func=klass_.data_value_func(female_df),
116            data_style_func=klass_.data_style_func(female_df),
117            header_value_func=klass_.header_value_func,
118            header_style_func=klass_.header_style_func,
119            index_style_func=klass_.index_style_func,
120            index_value_func=klass_.index_value_func,
121            index_name_func=lambda _: "Max by Year",
122        )
123
124        layout = [pm_all, [pm_top_female, pm_top_male]]
125        # render to xlsx
126        html = htmlw.HTMLWriter.to_html(layout, border=1, orientation="horizontal")
127        output_fp = os.path.join(tempfile.gettempdir(), "example3.html")
128        print("Writing to =", output_fp)
129        with open(output_fp, "w") as f:
130            f.write(html)
131
132
_images/html_example3.png

Example 4 - DataFrames with Multi-hierarchical columns and indices

Demonstrates rendering dataframes with multi-hierarchical indices and mult-hierarchical columns

  1class HTMLExample4:
  2    """
  3    Demonstrate styling and rendering of multi-hierarchical indexed dataframe
  4    into a html file.
  5    """
  6
  7    @staticmethod
  8    def data_value_func(df):
  9        def _inner(idx, col):
 10            if col == "gender":
 11                if df.loc[idx, col] == "F":
 12                    return "Female"
 13                return "Male"
 14            return df.loc[idx, col]
 15
 16        return _inner
 17
 18    @staticmethod
 19    def data_style_func(df):
 20        def _inner(idx, col):
 21            color = "#FFFFFF"
 22            text_align = "left"
 23            if col == "count":
 24                text_align = "right"
 25            if idx[1] == "F":
 26                color = "#bbdef8"
 27            else:
 28                color = "#e3f2fd"
 29
 30            return html_style.td_style(
 31                text_align=text_align,
 32                background_color=color,
 33                color="#000000",
 34                font_weight="normal",
 35                white_space="pre",
 36                padding="10px",
 37                border=None,
 38            )
 39
 40        return _inner
 41
 42    @staticmethod
 43    def index_name_value_func(value):
 44        return "Max By Year"
 45
 46    @staticmethod
 47    def header_value_func(node):
 48        return node.value.capitalize()
 49
 50    @staticmethod
 51    def header_style_func(node):
 52        return html_style.td_style(
 53            text_align="center",
 54            background_color="#4F81BD",
 55            color="#FFFFFF",
 56            font_weight="bold",
 57            white_space="pre",
 58            padding="10px",
 59            border=1,
 60        )
 61
 62    @staticmethod
 63    def index_value_func(node):
 64        if isinstance(node.value, str):
 65            return node.value.capitalize()
 66        return node.value
 67
 68    @staticmethod
 69    def index_style_func(node):
 70        return html_style.td_style(
 71            text_align="center",
 72            background_color="#4F81BD",
 73            color="#FFFFFF",
 74            font_weight="bold",
 75            white_space="pre",
 76            padding="10px",
 77            border=1,
 78        )
 79
 80    @classmethod
 81    def render_html(cls):
 82
 83        # Prepare first data frame (same as in render_xlsx)
 84        data_df = load_names_data()
 85        data_df = data_df[data_df["year"] >= 2000]
 86        g = data_df.groupby(("year", "gender"))
 87        df = g.max()
 88
 89        klass_ = cls
 90        pm = tc.build_presentation_model(
 91            df=df,
 92            output_format="html",
 93            data_value_func=klass_.data_value_func(df),
 94            data_style_func=klass_.data_style_func(df),
 95            header_value_func=klass_.header_value_func,
 96            header_style_func=klass_.header_style_func,
 97            index_style_func=klass_.index_style_func,
 98            index_value_func=klass_.index_value_func,
 99            index_name_func=klass_.index_name_value_func,
100        )
101
102        layout = [pm]
103        # render to xlsx
104        html = htmlw.HTMLWriter.to_html(layout, border=1)
105        output_fp = os.path.join(tempfile.gettempdir(), "example4.html")
106        print("Writing to =", output_fp)
107        with open(output_fp, "w") as f:
108            f.write(html)
109
110
_images/html_example4.png

API

Building the presentation model

table_compositor.table_compositor.build_presentation_model(*, df, output_format='xlsx', data_value_func=None, column_style_func=None, data_style_func=None, header_style_func=None, header_value_func=None, index_style_func=None, index_value_func=None, index_name_func=None, index_name_style_func=None, engine='openpyxl', **kwargs)[source]

Construct and return the presentation model that will be used while rendering to html/xlsx formats. The returned object has all the information required to render the tables in the requested format. The details of the object is transparent to the caller. It is only exposed for certain advanced operations.

Parameters
  • df – The dataframe representation of the table. The shape of the dataframe closely resembles the table that will be rendered in the requested format.

  • output_format – ‘html’ or ‘xlsx’

  • data_value_func – example: lambda idx, col: df.loc[idx, col], assuming df is in the closure. This can be None, if no data transformation is required to the values already present in the source df

  • column_style_func – the function can substitute the data_style_func, if the same style can be applied for the whole column. This argument should be prefered over the data_style_func argument. Using this option provides better performance since the fewer objects will be created internally and fewer callbacks are made to this function when compared to data_style_func.This argument only applies to the data contained in the dataframe and not the cell where the headers are rendered. For fine grained control at cell level, the data_style_func argument can be used. For more information on return values of this function, refer to the documentation for data_style_func argument.

  • data_style_func – used to provide style at the cell level. Example: lambda idx, col: return dict(font=Font(…)), where Font is the openpyxl object and font is the attr available in the cell instance of openpyxl. For xlsx, the keys in the dict are the attrs of the cell object in openpyxl and the values correspond to the value of that attribute. Example are found in xlsx_styles module. For html, the key-value pairs are any values that go into to the style attribute of a td, th cell in html. Examples are found in html_styles module. example: dict(background-color=’#F8F8F8’). When performance becomes an issue, and cell level control is not needed, it is recommended to use the column_style_func argument rathat than this argument. If the prefered engine is XlswWriter, then the style dictionary returned should have key/values compatible with the Format object declarted in the XlsxWriter library. A reference can be found in ``xlsx_styles.XlsxWriterStyleHelper` class

  • header_value_func – func that takes a object of type IndexNode. The IndexNode contains the attributes that refer to the header being rendered. The returned value from this function is displayed in place of the header in the dataframe at the location. The two properties available on the IndexNode object are value and key. The key is useful to identify the exact index and level in context while working with multi-hierarchical columns.

  • header_style_func – func that takes a object of type IndexNode. The return value of this function is similar to data_style_func.

  • index_value_func – func that takes a object of type IndexNode. The IndexNode contains the attributes that refer to the index being rendered. The returned value from this function is displayed in place of the index in the dataframe at the location.

  • index_style_func – func that takes a object of type IndexNode. The return value of this function is similar to data_style_func.

  • index_name_func – func that returns a string for index name (value to be displayed on top-left corner, above the index column)

  • index_name_style – the style value same as data_style_func that will be used to style the cell

  • engine – required while building presentation model for xlsx. Argument ignored for HTML rendering. This argument is used to provide the default callback style functions, where the style dictionary returned by the callback functions should be compatible with the engine being used.

  • kwargs

    ‘hide_index’ - if True, then hide the index column, default=False

    ’hide_header, - if True, then hide the header, default=False

    ’use_convert’ - if True, do some conversions from dataframe values to values excel can understand for example np.NaN are converted to NaN strings

Returns

A presentation model, to be used to create layout and provide the layout to the html or xlsx writers.

About the callback functions provided as arguments:

Note that callback function provided as arguments to this function are provided with either a tuple of index, col arguments are some information regarding the index or headers being rendered. Therefore, a common pattern would be to capture the dataframe being rendered in a closure of this callback func before passing them as arugments.

For example:

df = pd.DataFrame(dict(a=[1, 2, 3]))

def data_value_func():
def _inner(idx, col):

return df.loc[idx, col] * 10.3

return _inner

pm = build_presentation_model(df=df, data_value_func=data_value_func())

Rendering to XLSX

Rendering to HTML

class table_compositor.html_writer.HTMLWriter[source]
static to_html(layout, orientation='vertical', **kwargs)[source]

Take a layout which contains a list of presentation models builts using the build_presentation_model function.

Parameters
  • layout – An nested list of presentation_models, examples: [presentation_model] or [presentation_model1, presentation_mode2]. Not all nested layouts work very well in HTML, currently

  • orientation – if vertical, the top level presentation model elements are rendered vertically, and for every nested level the orientation is flipped. if horizontal, then the behavior is inverse

  • kwargs – all key-value pairs available in kwargs are directly set as value of the style attribute of table tag. example dict(backgroud-color=’#FF88FF’), is used as <table style=’background-color:#FF88FF’>..</table>

Returns

Return a HTML formatted string. The outermost tag of the returned string is the <table>

Helper XLSX Styles

class table_compositor.xlsx_styles.OpenPyxlStyleHelper[source]
static default_header_style(*, alignment='center', font=<openpyxl.styles.fonts.Font object> Parameters: name=None, charset=None, family=None, b=True, i=False, strike=None, outline=None, shadow=None, condense=None, color=<openpyxl.styles.colors.Color object> Parameters: rgb='00FFFFFF', indexed=None, auto=None, theme=None, tint=0.0, type='rgb', extend=None, sz=None, u=None, vertAlign=None, scheme=None, bgColor='4F81BD', border=<object object>)[source]

Provides styles for default headers for OpenPyxl engine

Parameters
  • alignment – ‘center’, ‘left’, ‘right’ used for horizontal alignment

  • font – an openpyxl.Font instance

  • bgColor – hex color that will be used as background color in the fill pattern

  • border – an openpyxl.Border instance, defaults to thin white border

Returns

A dict of key-values pairs, where each key is a attr of the cell object in openyxl and value is valid value of that attr.

static get_style(number_format='General', bg_color=None, border=None, font=None)[source]

Helper method to return a openpyxl Style

Parameters
  • number_format – an xlsx compatibale number format string

  • bg_color – hex color that will be used as background color in the fill pattern

  • border – an openpyxl.Border instance, defaults to thin white border

Returns

A dict of key-values pairs, where each key is a attr of the cell object in openyxl and value is valid value of that attr.

class table_compositor.xlsx_styles.XlsxWriterStyleHelper[source]

Class provides style objects for XlsxWriter library uses to render xlsx files

static default_header_style(*, number_format='General', alignment='center', font=None, bgColor='#4F81BD', border=<object object>)[source]

Provides styles for default headers for XlsxWriter engine

Parameters
  • alignment – ‘center’, ‘left’, ‘right’ used for horizontal alignment

  • font – an openpyxl.Font instance

  • bgColor – hex color that will be used as background color in the fill pattern

  • border – an openpyxl.Border instance, defaults to thin white border

Returns

A dict of key-values pairs, where each key is a attr of the cell object in openyxl and value is valid value of that attr.

static get_style(number_format='General', bg_color=None, border=<object object>)[source]

Helper method to return Style dictionary for XlsxWriter engine

Parameters
  • number_format – an xlsx compatibale number format string

  • bg_color – hex color that will be used as background color in the fill pattern

  • border – an openpyxl.Border instance, defaults to thin white border

Returns

A dict of key-values pairs, where each key/value is compatible with the Format object in XlsxWriter library.

class table_compositor.xlsx_styles.XLSXWriterDefaults[source]

Class provides defaults callback funcs that can be used while calling the build_presentation_model.

static data_style_func(df)[source]

Default value that can be used as callback for data_style_func

Parameters

df – the dataframe that will be used to build the presentation model

Returns

a function table takes idx, col as arguments and returns a openpyxl compatible style dictionary

static data_value_func(df)[source]

Default value that can be used as callback for data_value_func

Parameters

df – the dataframe that will be used to build the presentation model

Returns

A function that takes idx, col as arguments and returns the df.loc[idx, col] value

static header_style_func(df)[source]

Default value that can be used as callback for data_style_func

Parameters

df – the dataframe that will be used to build the presentation model

Returns

a function table takes node as arguments and returns a openpyxl compatible style dictionary

static header_value_func(df)[source]

Default value that can be used as callback for data_header_func

Parameters

df – the dataframe that will be used to build the presentation model

Returns

a function table takes node as arguments and returns node.value

static index_name_style_func(df)[source]

Default value that can be used as callback for index_name_style_func

Parameters

df – the dataframe that will be used to build the presentation model

Returns

a function table takes index.name as arguments and returns a openpyxl compatible style dictionary

static index_name_value_func(df)[source]

Default value that can be used as callback for index_name_value_func

Parameters

df – the dataframe that will be used to build the presentation model

Returns

a function table takes index.name as arguments and returns index.name if not None, else ‘’

static index_style_func(df)[source]

Default value that can be used as callback for index_style_func

Parameters

df – the dataframe that will be used to build the presentation model

Returns

a function table takes node as arguments and returns a openpyxl compatible style dictionary

static index_value_func(df)[source]

Default value that can be used as callback for index_header_func

Parameters

df – the dataframe that will be used to build the presentation model

Returns

a function table takes node as arguments and returns node.value

Helper HTML Styles

class table_compositor.html_styles.HTMLWriterDefaults[source]

Class provides defaults callback funcs that can be used while calling the build_presentation_model.

static data_style_func(df)[source]

Default value that can be used as callback for data_style_func

Parameters

df – the dataframe that will be used to build the presentation model

Returns

a function table takes idx, col as arguments and returns a dictionary of html style attributes

static data_value_func(df, dollar_columns=None)[source]

Default value that can be used as callback for data_value_func

Parameters

df – the dataframe that will be used to build the presentation model

Returns

A function that takes idx, col as arguments and returns the df.loc[idx, col] value

static header_style_func(df)[source]

Default value that can be used as callback for header_style_func

Parameters

df – the dataframe that will be used to build the presentation model

Returns

a function table takes node as argument and returns a dictionary of html style attributes

static header_value_func(df)[source]

Default value that can be used as callback for header_value_func

Parameters

df – the dataframe that will be used to build the presentation model

Returns

A function that takes node as arguments and returns the node.value

static index_name_style_func(df)[source]

Default value that can be used as callback for index_name_style_func

Parameters

df – the dataframe that will be used to build the presentation model

Returns

a function table takes index.name as argument and returns a dictionary of html style attributes

static index_name_value_func(df)[source]

Default value that can be used as callback for index_name_value_func

Parameters

df – the dataframe that will be used to build the presentation model

Returns

A function that takes index.name as argument and return index.name if not None else ‘’

static index_style_func(df)[source]

Default value that can be used as callback for index_style_func

Parameters

df – the dataframe that will be used to build the presentation model

Returns

a function table takes node as argument and returns a dictionary of html style attributes

static index_value_func(df)[source]

Default value that can be used as callback for index_value_func

Parameters

df – the dataframe that will be used to build the presentation model

Returns

A function that takes node as arguments and returns the node.value

Code Used in documentation

Basic Usage

# start_imports
import os
import tempfile

import pandas as pd

from table_compositor.table_compositor import build_presentation_model
from table_compositor.xlsx_styles import OpenPyxlStyleHelper

# There are equivalent classes for using xlsxwriter library. Namely,
# XlsxWriterCompositor and XlsxWriterStyleHelper
from table_compositor.xlsx_writer import OpenPyxlCompositor

# end_imports

# start_basic_example_2
def basic_example2():

    df = pd.DataFrame(
        dict(a=[10, 20, 30, 40, 50], b=[0.1, 0.9, 0.2, 0.6, 0.3]), index=[1, 2, 3, 4, 5]
    )

    def style_func(idx, col):
        if col == "b":
            return OpenPyxlStyleHelper.get_style(number_format="0.00%")
        else:
            # for 'a' we do dollar format
            return OpenPyxlStyleHelper.get_style(number_format="$#,##.00")

    # create a presentation model
    # note the OpenPyxlStyleHelper function available in xlsx_styles module. But a return value of style function
    # can be any dict whose keys are attributes of the OpenPyxl cell object.
    presentation_model = build_presentation_model(
        df=df,
        data_value_func=lambda idx, col: df.loc[idx, col] * 10
        if col == "a"
        else df.loc[idx, col],
        data_style_func=style_func,
        header_value_func=lambda node: node.value.capitalize(),
        header_style_func=lambda _: OpenPyxlStyleHelper.default_header_style(),
        index_value_func=lambda node: node.value * 100,
        index_style_func=lambda _: OpenPyxlStyleHelper.default_header_style(),
        index_name_func=lambda _: "Basic Example",
        index_name_style_func=lambda _: OpenPyxlStyleHelper.default_header_style(),
    )

    # create a layout, which is usually a nested list of presentation models
    layout = [presentation_model]

    # render to xlsx
    output_fp = os.path.join(tempfile.gettempdir(), "basic_example2.xlsx")
    OpenPyxlCompositor.to_xlsx(layout=layout, output_fp=output_fp)


# end_basic_example_2

# start_basic_example_3
def basic_example3():

    df = pd.DataFrame(
        dict(
            a=[10, 20, 30, 40],
            b=[0.1, 0.9, 0.2, 0.6],
            d=[50, 60, 70, 80],
            e=[200, 300, 400, 500],
        )
    )
    df.columns = pd.MultiIndex.from_tuples(
        [("A", "x"), ("A", "y"), ("B", "x"), ("B", "y")]
    )
    df.index = pd.MultiIndex.from_tuples([(1, 100), (1, 200), (2, 100), (2, 200)])
    print(df)

    def index_style_func(node):
        # node.key here could be one of (1,), (1, 100), (2,), (2, 100), (2, 200)
        bg_color = "FFFFFF"
        if node.key == (1,) or node.key == (2,):
            bg_color = "9E80B8"
        elif node.key[1] == 100:
            bg_color = "4F90C1"
        elif node.key[1] == 200:
            bg_color = "6DC066"
        return OpenPyxlStyleHelper.get_style(bg_color=bg_color)

    def header_style_func(node):
        bg_color = "FFFFFF"
        if node.key == ("A",) or node.key == ("B",):
            bg_color = "9E80B8"
        elif node.key[1] == "x":
            bg_color = "4F90C1"
        elif node.key[1] == "y":
            bg_color = "6DC066"
        return OpenPyxlStyleHelper.get_style(bg_color=bg_color)

    # create a presentation model
    # note the OpenPyxlStyleHeloer function available in xlsx_styles module. But a return value of style function
    # can be any dict whose keys are attributes of the OpenPyxl cell object.
    presentation_model = build_presentation_model(
        df=df,
        index_style_func=index_style_func,
        header_style_func=header_style_func,
        index_name_func=lambda _: "Multi-Hierarchy Example",
    )

    # create a layout, which is usually a nested list of presentation models
    layout = [presentation_model]

    # render to xlsx
    output_fp = os.path.join(tempfile.gettempdir(), "basic_example3.xlsx")
    OpenPyxlCompositor.to_xlsx(layout=layout, output_fp=output_fp)


# end_basic_example_3


# start_layout_example_1
def layout_example1():

    df = pd.DataFrame(
        dict(a=[10, 20, 30, 40, 50], b=[0.1, 0.9, 0.2, 0.6, 0.3]), index=[1, 2, 3, 4, 5]
    )

    def style_func(idx, col):
        if col == "b":
            return OpenPyxlStyleHelper.get_style(number_format="0.00%")
        else:
            # for 'a' we do dollar format
            return OpenPyxlStyleHelper.get_style(number_format="$#,##.00")

    # create a presentation model
    # note the OpenPyxlStyleHeloer function available in xlsx_styles module. But a return value of style function
    # can be any dict whose keys are attributes of the OpenPyxl cell object.
    presentation_model = build_presentation_model(
        df=df,
        data_value_func=lambda idx, col: df.loc[idx, col] * 10
        if col == "a"
        else df.loc[idx, col],
        data_style_func=style_func,
        header_value_func=lambda node: node.value.capitalize(),
        header_style_func=lambda _: OpenPyxlStyleHelper.default_header_style(),
        index_value_func=lambda node: node.value * 100,
        index_style_func=lambda _: OpenPyxlStyleHelper.default_header_style(),
        index_name_func=lambda _: "Basic Example",
        index_name_style_func=lambda _: OpenPyxlStyleHelper.default_header_style(),
    )

    # start_layout_code_1
    # create a layout, which is usually a nested list of presentation models
    layout = [[presentation_model], [[presentation_model], [presentation_model]]]

    # render to xlsx
    output_fp = os.path.join(tempfile.gettempdir(), "layout_vertical_example1.xlsx")
    # the default value for orientation is 'vertical'
    OpenPyxlCompositor.to_xlsx(
        layout=layout, output_fp=output_fp, orientation="vertical"
    )

    output_fp = os.path.join(tempfile.gettempdir(), "layout_horizontal_example1.xlsx")
    OpenPyxlCompositor.to_xlsx(
        layout=layout, output_fp=output_fp, orientation="horizontal"
    )
    print("Writing xlsx file=", output_fp)

    # mutiple nesting
    layout_complex = [
        presentation_model,
        [presentation_model, [presentation_model, presentation_model]],
    ]

    output_fp = os.path.join(tempfile.gettempdir(), "layout_complex_example1.xlsx")
    OpenPyxlCompositor.to_xlsx(
        layout=layout_complex, output_fp=output_fp, orientation="vertical"
    )
    print("Writing xlsx file=", output_fp)
    # end_layout_code_1


# end_layout_example_1


if __name__ == "__main__":
    basic_example2()
    basic_example3()
    layout_example1()

XLSX Examples

"""
This module is referred to by the Sphinx documentation. If you need to run this
module, install table_compositor in an separate  environment and then run this module
in that environment. This helps the imports find the modules in the right place
"""

import collections
import os

# start_imports
import tempfile
import webbrowser
import zipfile

import pandas as pd
import requests

import table_compositor.table_compositor as tc
import table_compositor.xlsx_styles as xlsstyle
import table_compositor.xlsx_writer as xlsxw

# end_imports

# start_data_routine
# code snippet adapted from http://function-pipe.readthedocs.io/en/latest/usage_df.html
# source url
URL_NAMES = "https://www.ssa.gov/oact/babynames/names.zip"
ZIP_NAME = "names.zip"


def load_names_data():
    fp = os.path.join(tempfile.gettempdir(), ZIP_NAME)
    if not os.path.exists(fp):
        r = requests.get(URL_NAMES)
        with open(fp, "wb") as f:
            f.write(r.content)

    post = collections.OrderedDict()
    with zipfile.ZipFile(fp) as zf:
        # get ZipInfo instances
        for zi in sorted(zf.infolist(), key=lambda zi: zi.filename):
            fn = zi.filename
            if fn.startswith("yob"):
                year = int(fn[3:7])
                df = pd.read_csv(
                    zf.open(zi), header=None, names=("name", "gender", "count")
                )
                df["year"] = year
                post[year] = df

        df = pd.concat(post.values())
        df.set_index("name", inplace=True, drop=True)
        return df


def sample_names_data():
    df = load_names_data()
    df = df[(df["year"] == 2015) & (df["count"] > 1000)]
    return df.sample(100, random_state=0).sort_values("count")


def top_names_for_year(year=2015, gender="F", top_n=5):
    df = load_names_data()
    df = df[(df["year"] == year) & (df["gender"] == gender)]
    df = df.sort_values("count")[:top_n]
    return df


# end_data_routine

# start_XLSXExample1
class XLSXExample1:
    """
    Demonstrates rendering a simple dataframe to a xlsx file
    using the default styles
    """

    @classmethod
    def render_xlsx(cls):
        """
        Render the df to a xlsx file.
        """

        # load data
        df = sample_names_data()
        # build presentation model
        pm = tc.build_presentation_model(df=df, output_format="xlsx")

        # render to xlsx
        tempdir = tempfile.gettempdir()
        fp = os.path.join(tempdir, "example1.xlsx")
        layout = [pm]
        print("Writing to " + fp)
        xlsxw.OpenPyxlCompositor.to_xlsx(layout=layout, output_fp=fp)


# end_XLSXExample1


# start_XLSXExample2
class XLSXExample2:
    """
    Demonstrates using call-backs that help set the display and style
    properties of each cell in the xlsx sheet.
    """

    @staticmethod
    def data_value_func(df):
        def _inner(idx, col):
            if col == "gender":
                if df.loc[idx, col] == "F":
                    return "Female"
                return "Male"
            return df.loc[idx, col]

        return _inner

    @staticmethod
    def data_style_func(df):
        def _inner(idx, col):
            bg_color = None
            number_format = "General"
            if col == "count":
                number_format = "#,##0"
            if df.loc[idx, "gender"] == "F":
                bg_color = "bbdef8"
            else:
                bg_color = "e3f2fd"
            return xlsstyle.OpenPyxlStyleHelper.get_style(
                bg_color=bg_color, number_format=number_format
            )

        return _inner

    @staticmethod
    def index_name_value_func(value):
        return value.capitalize()

    @staticmethod
    def index_name_style_func(value):
        return xlsstyle.OpenPyxlStyleHelper.default_header_style()

    @staticmethod
    def header_value_func(node):
        return node.value.capitalize()

    @staticmethod
    def header_style_func(node):
        return xlsstyle.OpenPyxlStyleHelper.default_header_style()

    @staticmethod
    def index_value_func(node):
        return node.value.capitalize()

    @staticmethod
    def index_style_func(df):
        def _inner(node):
            bg_color = None
            if df.loc[node.value, "gender"] == "F":
                bg_color = "bbdef8"
            else:
                bg_color = "e3f2fd"
            return xlsstyle.OpenPyxlStyleHelper.get_style(bg_color=bg_color)

        return _inner

    @classmethod
    def render_xlsx(cls):
        # load data
        df = sample_names_data()
        # build presentation model
        klass_ = XLSXExample2
        pm = tc.build_presentation_model(
            df=df,
            output_format="xlsx",
            data_value_func=klass_.data_value_func(df),
            data_style_func=klass_.data_style_func(df),
            header_value_func=klass_.header_value_func,
            header_style_func=klass_.header_style_func,
            index_style_func=klass_.index_style_func(df),
            index_value_func=klass_.index_value_func,
            index_name_style_func=klass_.index_name_style_func,
            index_name_func=klass_.index_name_value_func,
        )

        # render to xlsx
        tempdir = tempfile.gettempdir()
        fp = os.path.join(tempdir, "example2.xlsx")
        layout = [pm]
        print("Writing to " + fp)
        xlsxw.OpenPyxlCompositor.to_xlsx(layout=layout, output_fp=fp)


# end_XLSXExample2

# start_XLSXExample3
class XLSXExample3:
    """
    Demonstrates using call-backs and also rendering multiple tables to single
    worksheet.
    """

    @staticmethod
    def data_value_func(df):
        def _inner(idx, col):
            if col == "gender":
                if df.loc[idx, col] == "F":
                    return "Female"
                return "Male"
            return df.loc[idx, col]

        return _inner

    @staticmethod
    def data_style_func(df):
        def _inner(idx, col):
            bg_color = None
            number_format = "General"
            if col == "count":
                number_format = "#,##0"
            if df.loc[idx, "gender"] == "F":
                bg_color = "bbdef8"
            else:
                bg_color = "e3f2fd"
            return xlsstyle.OpenPyxlStyleHelper.get_style(
                bg_color=bg_color, number_format=number_format
            )

        return _inner

    @staticmethod
    def index_name_value_func(value):
        return value.capitalize()

    @staticmethod
    def index_name_style_func(value):
        return xlsstyle.OpenPyxlStyleHelper.default_header_style()

    @staticmethod
    def header_value_func(node):
        return node.value.capitalize()

    @staticmethod
    def header_style_func(node):
        return xlsstyle.OpenPyxlStyleHelper.default_header_style()

    @staticmethod
    def index_value_func(node):
        return node.value.capitalize()

    @staticmethod
    def index_style_func(df):
        def _inner(node):
            bg_color = None
            if df.loc[node.value, "gender"] == "F":
                bg_color = "bbdef8"
            else:
                bg_color = "e3f2fd"
            return xlsstyle.OpenPyxlStyleHelper.get_style(bg_color=bg_color)

        return _inner

    @classmethod
    def render_xlsx(cls):
        # Prepare first data frame (same as in render_xlsx)
        df = sample_names_data()
        # build presentation model
        klass_ = XLSXExample3
        pm_all = tc.build_presentation_model(
            df=df,
            output_format="xlsx",
            data_value_func=klass_.data_value_func(df),
            data_style_func=klass_.data_style_func(df),
            header_value_func=klass_.header_value_func,
            header_style_func=klass_.header_style_func,
            index_style_func=klass_.index_style_func(df),
            index_value_func=klass_.index_value_func,
            index_name_style_func=klass_.index_name_style_func,
            index_name_func=klass_.index_name_value_func,
        )

        male_df = top_names_for_year(gender="M")
        pm_top_male = tc.build_presentation_model(
            df=male_df,
            output_format="xlsx",
            data_value_func=klass_.data_value_func(male_df),
            data_style_func=klass_.data_style_func(male_df),
            header_value_func=klass_.header_value_func,
            header_style_func=klass_.header_style_func,
            index_style_func=klass_.index_style_func(male_df),
            index_value_func=klass_.index_value_func,
            index_name_style_func=klass_.index_name_style_func,
            index_name_func=klass_.index_name_value_func,
        )

        female_df = top_names_for_year(gender="F")
        pm_top_female = tc.build_presentation_model(
            df=female_df,
            output_format="xlsx",
            data_value_func=klass_.data_value_func(female_df),
            data_style_func=klass_.data_style_func(female_df),
            header_value_func=klass_.header_value_func,
            header_style_func=klass_.header_style_func,
            index_style_func=klass_.index_style_func(female_df),
            index_value_func=klass_.index_value_func,
            index_name_style_func=klass_.index_name_style_func,
            index_name_func=klass_.index_name_value_func,
        )

        layout = [pm_all, [pm_top_female, pm_top_male]]
        # render to xlsx
        tempdir = tempfile.gettempdir()
        fp = os.path.join(tempdir, "example3.xlsx")
        print("Writing to " + fp)
        xlsxw.OpenPyxlCompositor.to_xlsx(
            layout=layout, output_fp=fp, orientation="horizontal"
        )


# end_XLSXExample3

# start_XLSXExample4
class XLSXExample4:
    """
    Demonstrate styling and rendering of multi-hierarchical indexed dataframe
    into a xlsx file.
    """

    @staticmethod
    def data_style_func(df):
        def _inner(idx, col):
            bg_color = None
            number_format = "General"
            if col == "count":
                number_format = "#,##0"
            if idx[1] == "F":
                bg_color = "bbdef8"
            else:
                bg_color = "e3f2fd"
            return xlsstyle.OpenPyxlStyleHelper.get_style(
                bg_color=bg_color, number_format=number_format
            )

        return _inner

    @staticmethod
    def index_name_value_func(value):
        return "Max By Year"

    @staticmethod
    def index_name_style_func(value):
        return xlsstyle.OpenPyxlStyleHelper.default_header_style()

    @staticmethod
    def header_value_func(node):
        return node.value.capitalize()

    @staticmethod
    def header_style_func(node):
        return xlsstyle.OpenPyxlStyleHelper.default_header_style()

    @staticmethod
    def index_value_func(node):
        if isinstance(node.value, str):
            return node.value.capitalize()
        return node.value

    @staticmethod
    def index_style_func(df):
        def _inner(node):
            bg_color = None
            if len(node.key) == 1:
                bg_color = "4f81bd"
            elif node.key[1] == "F":
                bg_color = "bbdef8"
            else:
                bg_color = "e3f2fd"
            return xlsstyle.OpenPyxlStyleHelper.get_style(bg_color=bg_color)

        return _inner

    @classmethod
    def render_xlsx(cls):

        # Prepare first data frame (same as in render_xlsx)
        data_df = load_names_data()
        data_df = data_df[data_df["year"] >= 2000]
        g = data_df.groupby(("year", "gender"))
        df = g.max()

        klass_ = cls
        pm = tc.build_presentation_model(
            df=df,
            output_format="xlsx",
            # data_value_func=None,   # use default
            data_style_func=klass_.data_style_func(df),
            header_value_func=klass_.header_value_func,
            header_style_func=klass_.header_style_func,
            index_style_func=klass_.index_style_func(df),
            index_value_func=klass_.index_value_func,
            index_name_style_func=klass_.index_name_style_func,
            index_name_func=klass_.index_name_value_func,
        )

        layout = [pm]
        # render to xlsx
        tempdir = tempfile.gettempdir()
        fp = os.path.join(tempdir, "example4.xlsx")
        print("Writing to " + fp)
        xlsxw.OpenPyxlCompositor.to_xlsx(
            layout=layout, output_fp=fp, orientation="horizontal"
        )


# end_XLSXExample4


def main():
    XLSXExample1.render_xlsx()
    XLSXExample2.render_xlsx()
    XLSXExample3.render_xlsx()
    XLSXExample4.render_xlsx()


if __name__ == "__main__":
    main()

HTML Examples

"""
This module is referred to by the Sphinx documentation. If you need to run this
module, install table_compositor in an separate  environment and then run this module
in that environment. This helps the imports find the modules in the right place
"""


import collections
import os

# start_imports
import tempfile
import webbrowser
import zipfile

import pandas as pd
import requests

import table_compositor.html_styles as html_style
import table_compositor.html_writer as htmlw
import table_compositor.table_compositor as tc

# end_imports

# start_data_routine
# code snippet adapted from http://function-pipe.readthedocs.io/en/latest/usage_df.html
# source url
URL_NAMES = "https://www.ssa.gov/oact/babynames/names.zip"
ZIP_NAME = "names.zip"


def load_names_data():
    fp = os.path.join(tempfile.gettempdir(), ZIP_NAME)
    if not os.path.exists(fp):
        r = requests.get(URL_NAMES)
        with open(fp, "wb") as f:
            f.write(r.content)

    post = collections.OrderedDict()
    with zipfile.ZipFile(fp) as zf:
        # get ZipInfo instances
        for zi in sorted(zf.infolist(), key=lambda zi: zi.filename):
            fn = zi.filename
            if fn.startswith("yob"):
                year = int(fn[3:7])
                df = pd.read_csv(
                    zf.open(zi), header=None, names=("name", "gender", "count")
                )
                df["year"] = year
                post[year] = df

        df = pd.concat(post.values())
        df.set_index("name", inplace=True, drop=True)
        return df


def sample_names_data():
    df = load_names_data()
    df = df[(df["year"] == 2015) & (df["count"] > 1000)]
    return df.sample(50, random_state=0).sort_values("count")


def top_names_for_year(year=2015, gender="F", top_n=5):
    df = load_names_data()
    df = df[(df["year"] == year) & (df["gender"] == gender)]
    df = df.sort_values("count")[:top_n]
    return df


# end_data_routine

# start_HTMLExample1
class HTMLExample1:
    """
    Demonstrate rendering of a simple dataframe into html
    """

    @classmethod
    def render_html(cls):

        # load data
        df = load_names_data()
        df = df[:100]

        # build presentation model
        pm = tc.build_presentation_model(df=df, output_format="html")

        # render to xlsx
        tempdir = tempfile.gettempdir()
        fp = os.path.join(tempdir, "example_1.html")
        layout = [pm]
        print("Writing to " + fp)
        html = htmlw.HTMLWriter.to_html(layout, border=1)
        output_fp = os.path.join(tempfile.gettempdir(), "example1.html")
        with open(output_fp, "w") as f:
            f.write(html)


# end_HTMLExample1

# start_HTMLExample2
class HTMLExample2:
    """
    Demonstrate rendering of a simple dataframe into html
    """

    @staticmethod
    def data_value_func(df):
        def _inner(idx, col):
            if col == "gender":
                if df.loc[idx, col] == "F":
                    return "Female"
                return "Male"
            return df.loc[idx, col]

        return _inner

    @staticmethod
    def data_style_func(df):
        def _inner(idx, col):
            color = "#FFFFFF"
            text_align = "left"
            if col == "count":
                text_align = "right"
            if df.loc[idx, "gender"] == "F":
                color = "#bbdef8"
            else:
                color = "#e3f2fd"
            return html_style.td_style(
                text_align=text_align,
                background_color=color,
                color="#000000",
                font_weight="normal",
                white_space="pre",
                padding="10px",
                border=None,
            )

        return _inner

    @staticmethod
    def index_name_value_func(value):
        return value.capitalize()

    @staticmethod
    def header_value_func(node):
        return node.value.capitalize()

    @staticmethod
    def header_style_func(node):
        return html_style.td_style(
            text_align="center",
            background_color="#4F81BD",
            color="#FFFFFF",
            font_weight="bold",
            white_space="pre",
            padding="10px",
            border=1,
        )

    @staticmethod
    def index_value_func(node):
        return node.value.capitalize()

    @staticmethod
    def index_style_func(node):
        return html_style.td_style(
            text_align="center",
            background_color="#4F81BD",
            color="#FFFFFF",
            font_weight="bold",
            white_space="pre",
            padding="10px",
            border=1,
        )

    @classmethod
    def render_html(cls):
        # load data
        df = sample_names_data()
        # build presentation model
        klass_ = HTMLExample2
        pm = tc.build_presentation_model(
            df=df,
            output_format="html",
            data_value_func=klass_.data_value_func(df),
            data_style_func=klass_.data_style_func(df),
            header_value_func=klass_.header_value_func,
            header_style_func=klass_.header_style_func,
            index_style_func=klass_.index_style_func,
            index_value_func=klass_.index_value_func,
            index_name_func=klass_.index_name_value_func,
        )

        layout = [pm]
        html = htmlw.HTMLWriter.to_html(layout, border=1)
        output_fp = os.path.join(tempfile.gettempdir(), "example2.html")
        print("Writing to =", output_fp)
        with open(output_fp, "w") as f:
            f.write(html)


# end_HTMLExample2


# start_HTMLExample3
class HTMLExample3:
    """
    Demonstrate styling and rendering of multiple multi-hierarchical indexed dataframe
    into a html file
    """

    @staticmethod
    def data_value_func(df):
        def _inner(idx, col):
            if col == "gender":
                if df.loc[idx, col] == "F":
                    return "Female"
                return "Male"
            return df.loc[idx, col]

        return _inner

    @staticmethod
    def data_style_func(df):
        def _inner(idx, col):
            color = "#FFFFFF"
            text_align = "left"
            if col == "count":
                text_align = "right"
            if df.loc[idx, "gender"] == "F":
                color = "#bbdef8"
            else:
                color = "#e3f2fd"
            return html_style.td_style(
                text_align=text_align,
                background_color=color,
                color="#000000",
                font_weight="normal",
                white_space="pre",
                padding="10px",
                border=None,
            )

        return _inner

    @staticmethod
    def index_name_value_func(value):
        return "Max By Year"

    @staticmethod
    def header_value_func(node):
        return node.value.capitalize()

    @staticmethod
    def header_style_func(node):
        return html_style.td_style(
            text_align="center",
            background_color="#4F81BD",
            color="#FFFFFF",
            font_weight="bold",
            white_space="pre",
            padding="10px",
            border=1,
        )

    @staticmethod
    def index_value_func(node):
        if isinstance(node.value, str):
            return node.value.capitalize()
        return node.value

    @staticmethod
    def index_style_func(node):
        return html_style.td_style(
            text_align="center",
            background_color="#4F81BD",
            color="#FFFFFF",
            font_weight="bold",
            white_space="pre",
            padding="10px",
            border=1,
        )

    @classmethod
    def render_html(cls):

        # Prepare first data frame (same as in render_xlsx)
        df = sample_names_data()
        # build presentation model
        klass_ = HTMLExample4
        pm_all = tc.build_presentation_model(
            df=df,
            output_format="html",
            data_value_func=klass_.data_value_func(df),
            data_style_func=klass_.data_style_func(df),
            header_value_func=klass_.header_value_func,
            header_style_func=klass_.header_style_func,
            index_style_func=klass_.index_style_func,
            index_value_func=klass_.index_value_func,
            index_name_func=lambda _: "Sample Data",
        )

        male_df = top_names_for_year(gender="M")
        pm_top_male = tc.build_presentation_model(
            df=male_df,
            output_format="html",
            data_value_func=klass_.data_value_func(male_df),
            data_style_func=klass_.data_style_func(male_df),
            header_value_func=klass_.header_value_func,
            header_style_func=klass_.header_style_func,
            index_style_func=klass_.index_style_func,
            index_value_func=klass_.index_value_func,
            index_name_func=lambda _: "Max by Year",
        )

        female_df = top_names_for_year(gender="F")
        pm_top_female = tc.build_presentation_model(
            df=female_df,
            output_format="html",
            data_value_func=klass_.data_value_func(female_df),
            data_style_func=klass_.data_style_func(female_df),
            header_value_func=klass_.header_value_func,
            header_style_func=klass_.header_style_func,
            index_style_func=klass_.index_style_func,
            index_value_func=klass_.index_value_func,
            index_name_func=lambda _: "Max by Year",
        )

        layout = [pm_all, [pm_top_female, pm_top_male]]
        # render to xlsx
        html = htmlw.HTMLWriter.to_html(layout, border=1, orientation="horizontal")
        output_fp = os.path.join(tempfile.gettempdir(), "example3.html")
        print("Writing to =", output_fp)
        with open(output_fp, "w") as f:
            f.write(html)


# end_HTMLExample3

# start_HTMLExample4
class HTMLExample4:
    """
    Demonstrate styling and rendering of multi-hierarchical indexed dataframe
    into a html file.
    """

    @staticmethod
    def data_value_func(df):
        def _inner(idx, col):
            if col == "gender":
                if df.loc[idx, col] == "F":
                    return "Female"
                return "Male"
            return df.loc[idx, col]

        return _inner

    @staticmethod
    def data_style_func(df):
        def _inner(idx, col):
            color = "#FFFFFF"
            text_align = "left"
            if col == "count":
                text_align = "right"
            if idx[1] == "F":
                color = "#bbdef8"
            else:
                color = "#e3f2fd"

            return html_style.td_style(
                text_align=text_align,
                background_color=color,
                color="#000000",
                font_weight="normal",
                white_space="pre",
                padding="10px",
                border=None,
            )

        return _inner

    @staticmethod
    def index_name_value_func(value):
        return "Max By Year"

    @staticmethod
    def header_value_func(node):
        return node.value.capitalize()

    @staticmethod
    def header_style_func(node):
        return html_style.td_style(
            text_align="center",
            background_color="#4F81BD",
            color="#FFFFFF",
            font_weight="bold",
            white_space="pre",
            padding="10px",
            border=1,
        )

    @staticmethod
    def index_value_func(node):
        if isinstance(node.value, str):
            return node.value.capitalize()
        return node.value

    @staticmethod
    def index_style_func(node):
        return html_style.td_style(
            text_align="center",
            background_color="#4F81BD",
            color="#FFFFFF",
            font_weight="bold",
            white_space="pre",
            padding="10px",
            border=1,
        )

    @classmethod
    def render_html(cls):

        # Prepare first data frame (same as in render_xlsx)
        data_df = load_names_data()
        data_df = data_df[data_df["year"] >= 2000]
        g = data_df.groupby(("year", "gender"))
        df = g.max()

        klass_ = cls
        pm = tc.build_presentation_model(
            df=df,
            output_format="html",
            data_value_func=klass_.data_value_func(df),
            data_style_func=klass_.data_style_func(df),
            header_value_func=klass_.header_value_func,
            header_style_func=klass_.header_style_func,
            index_style_func=klass_.index_style_func,
            index_value_func=klass_.index_value_func,
            index_name_func=klass_.index_name_value_func,
        )

        layout = [pm]
        # render to xlsx
        html = htmlw.HTMLWriter.to_html(layout, border=1)
        output_fp = os.path.join(tempfile.gettempdir(), "example4.html")
        print("Writing to =", output_fp)
        with open(output_fp, "w") as f:
            f.write(html)


# end_HTMLExample4


def main():
    HTMLExample1.render_html()
    HTMLExample2.render_html()
    HTMLExample3.render_html()
    HTMLExample4.render_html()


if __name__ == "__main__":
    main()

Indices and tables