minio / minio

The Object Store for AI Data Infrastructure

Home Page: https://min.io/download

Activating MINIO_API_SELECT_PARQUET does not work

masalinas opened this issue

NOTE

If this case is urgent, please subscribe to Subnet so that our 24/7 support team may help you faster.

I have an active MinIO tenant. I set the environment variable MINIO_API_SELECT_PARQUET from the MinIO Operator (see the screenshot) to read Parquet files. But when I try to parse the file from a Python sample, MinIO responds:

An error occurred (InternalError) when calling the SelectObjectContent operation (reached max retries: 4): We encountered an internal error, please try again.: cause(parquet format parsing not enabled on server)
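The cause string at the end of this message comes from the server, not from boto3, which suggests the setting had not taken effect in the running MinIO server process. As a minimal sketch, a client could check for that cause text to tell this configuration problem apart from other internal errors. The helper name `is_parquet_disabled` and the constant are hypothetical; the marker text is copied from the error above.

```python
# Marker text copied from the error message reported in this issue.
PARQUET_DISABLED_MARKER = "parquet format parsing not enabled on server"


def is_parquet_disabled(error_message: str) -> bool:
    """Return True if the error text says Parquet parsing is off on the server."""
    return PARQUET_DISABLED_MARKER in error_message.lower()


# The message below is the one quoted in this issue.
message = (
    "An error occurred (InternalError) when calling the SelectObjectContent "
    "operation (reached max retries: 4): We encountered an internal error, "
    "please try again.: cause(parquet format parsing not enabled on server)"
)

if is_parquet_disabled(message):
    print("Server-side setting: Parquet select is not enabled on the server.")
else:
    print("Different failure; inspect the full error.")
```

In a real script the string would typically come from `str(exc)` on the exception raised by `select_object_content`.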

[Screenshot 2024-05-10 at 23 12 05]

This is the Python code:

import boto3

s3 = boto3.client('s3',
                  endpoint_url='https://localhost:9000',
                  aws_access_key_id='BsvW9jlpYX8TvD9F',
                  aws_secret_access_key='HrGdJapKsXbKEcXABWNQ2CO15v3y9MMk',
                  verify=False,
                  region_name='us-east-1')

r = s3.select_object_content(
    Bucket='uniovi',
    Key='sample.parquet',
    ExpressionType='SQL',
    Expression="select * from s3object",
    InputSerialization={'Parquet': {}},
    OutputSerialization={'CSV': {}},
)

# The payload is an event stream: Records events carry result bytes,
# and a Stats event reports the scan totals.
for event in r['Payload']:
    if 'Records' in event:
        records = event['Records']['Payload'].decode('utf-8')
        print(records)
    elif 'Stats' in event:
        stats_details = event['Stats']['Details']
        print("Stats details bytesScanned: ")
        print(stats_details['BytesScanned'])
        print("Stats details bytesProcessed: ")
        print(stats_details['BytesProcessed'])
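Once the server accepts the request, the event loop above could be wrapped in a small helper. The following is a sketch, not part of the original report: `collect_select_results` and `fake_events` are hypothetical names, and the fake events only imitate the `Records` / `Stats` / `End` shapes that boto3 yields. It joins the raw chunks before decoding, because a multi-byte UTF-8 character can be split across two `Records` events and decoding each chunk separately could then fail.

```python
from typing import Any, Dict, Iterable, Optional, Tuple


def collect_select_results(
    events: Iterable[Dict[str, Any]],
) -> Tuple[str, Optional[Dict[str, int]]]:
    """Gather all Records chunks and the Stats details from an event stream."""
    chunks = []
    stats = None
    for event in events:
        if "Records" in event:
            # Keep raw bytes; a multi-byte character may span two chunks.
            chunks.append(event["Records"]["Payload"])
        elif "Stats" in event:
            stats = event["Stats"]["Details"]
    # Decode once, after all chunks are joined.
    return b"".join(chunks).decode("utf-8"), stats


# Hypothetical events standing in for r['Payload'] from the snippet above.
# The two bytes of the accented character are deliberately split across chunks.
fake_events = [
    {"Records": {"Payload": b"id,name\n1,caf\xc3"}},
    {"Records": {"Payload": b"\xa9\n"}},
    {"Stats": {"Details": {"BytesScanned": 120, "BytesProcessed": 240}}},
    {"End": {}},
]

text, stats = collect_select_results(fake_events)
print(text)
print(stats["BytesScanned"], stats["BytesProcessed"])
```

With a real response, the call would be `collect_select_results(r['Payload'])`.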

Expected Behavior

Parquet files can be read with SelectObjectContent.

Current Behavior

I cannot read Parquet files.

Your Environment

  • Version used (minio --version): MinIO Operator 5.0.12
  • Server setup and configuration: minikube 0.041
  • Operating System and version (uname -a): macOS Sonoma
  • Python Boto3 dependency