scanny / python-pptx

Create Open XML PowerPoint documents in Python

How to extract CategoryAxis class names from a chart in a PowerPoint?

Number18-tong opened this issue

Thanks for your remarkable work!
I am trying to get the CategoryAxis class names, series names, and series values from a chart with python-pptx 0.6.22. I can get the text of the series names and values, but I cannot find a way to extract the CategoryAxis class names. How can I get the CategoryAxis class names from a chart in a PowerPoint file?

I'm not familiar with the term "CategoryAxis class name". Can you say more?

[Image: WeCom screenshot 17096040376036]
Sorry, I may not have expressed myself clearly. The red box in the picture above marks the "CategoryAxis class names" that I cannot extract from the chart.

Thanks for your reply.
I read the documentation and tried to extract the text of a chart's tick_labels, but could not manage it. Can the tick label text only be set or changed, and not read?

Post the code you tried and I'll take a look.

from pptx import Presentation
import json

def get_chart_data(chart):
    charttext = ""

    ## tried to get the text of category_axis.tick_labels (the x-axis labels) but could not
    # chartxaxislabels = ""
    # tick_labels = chart.category_axis.tick_labels
    # chartxaxis =

    for series in chart.series:
        charttext += "|\t" + series.name + "\t|"
        charttext += '\t|'.join([str(value) for value in series.values]) + "\t|\n"  # one column of data
    return {'chart title': chart.chart_title.text_frame.text if chart.has_title else '',
            'chart': charttext}


def extract_elements_info(ppt_file):
    presentation = Presentation(ppt_file)
    results = []
    for slide in presentation.slides:
        for shape in slide.shapes:
            # Skip text boxes, pictures, tables, etc.; only chart frames have a chart
            if shape.has_chart:
                results.append(get_chart_data(shape.chart))
    return results


ppt_file_path = r"F:\temtestres\pdf\eval\ppttest/test.pptx"
res = extract_elements_info(ppt_file_path)
print(res)
with open(r'F:\temtestres\pdf\eval\ppttest/test.json', 'w', encoding="utf-8") as file:
    json.dump(res, file, indent=2, ensure_ascii=False)
    # json.dump(res, file, indent=2)

print("Done")

The goal of the code above is to take a PowerPoint file containing a chart and output a Markdown table of the chart's data, but the code can only get the part in the red box in the following image. I tried to get the x-axis labels through chart.category_axis.tick_labels, but it carries no text. Is there a way to get the text of the x-axis labels?
[Image: WeCom screenshot 17097082487288]

OK, you're looking for Plot.categories. A single chart can contain more than one plot, like a line chart overlaid on a bar chart, and each plot can have different categories.

Try chart.plots[0].categories.
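
For reference, here is a minimal sketch of how that suggestion could be turned into the Markdown table the question was after. The helper names chart_to_markdown and build_demo_chart and the sample data are made up for illustration; only chart.plots, plot.categories, plot.series, series.name and series.values are taken from python-pptx. It assumes every series in a plot has one value per category.

```python
def chart_to_markdown(chart):
    """Render each plot of a python-pptx chart as a Markdown table.

    Hypothetical helper: one table per plot, one row per series,
    one column per category label.
    """
    tables = []
    for plot in chart.plots:
        # Category objects subclass str, so str() gives the label text.
        categories = [str(category) for category in plot.categories]
        header = "| Series | " + " | ".join(categories) + " |"
        divider = "|" + " --- |" * (len(categories) + 1)
        rows = [
            "| " + " | ".join([series.name] + [str(v) for v in series.values]) + " |"
            for series in plot.series
        ]
        tables.append("\n".join([header, divider] + rows))
    return "\n\n".join(tables)


def build_demo_chart():
    """Build a small in-memory chart to try the helper on (made-up data)."""
    # Imported here so chart_to_markdown stays usable on any chart-like object.
    from pptx import Presentation
    from pptx.chart.data import CategoryChartData
    from pptx.enum.chart import XL_CHART_TYPE
    from pptx.util import Inches

    chart_data = CategoryChartData()
    chart_data.categories = ["Q1", "Q2", "Q3"]
    chart_data.add_series("Sales", (10, 20, 30))

    prs = Presentation()
    slide = prs.slides.add_slide(prs.slide_layouts[6])  # blank layout in the default template
    frame = slide.shapes.add_chart(
        XL_CHART_TYPE.COLUMN_CLUSTERED,
        Inches(1), Inches(1), Inches(6), Inches(4),
        chart_data,
    )
    return frame.chart
```

Calling print(chart_to_markdown(build_demo_chart())) prints one table for the demo chart. For a chart taken from a real file, pass shape.chart for any shape where shape.has_chart is true. Looping over chart.plots rather than indexing plots[0] covers the overlaid-chart case mentioned above.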

It works now. Thanks very much!