Need Support for Dynamic CSV Headers in filelog Receiver. #36415
Comments
To me, it sounds like a valid enhancement request. But I'm quite unsure how to accomplish this.
This is an enhancement, not a bug. Please let me know if you disagree.
I'm not entirely sure whether this qualifies as an enhancement or a bug. At the very least, the receiver should not accept regular expressions for file names when the csv_parser is used. Please share your thoughts.
I see what you mean, but I'm not very supportive of that idea. It seems like a strange case to me. I'll explore the codebase and see what we can do here. In the meantime, I'll ask @djaglowski to share his thoughts on this issue.
Have you tried using
@djaglowski I think the user is looking for the following use case. Consider two CSV files, file1.csv and file2.csv:
file1.csv:
file2.csv:
If the user uses the following config, they may want the parser to automatically detect the headers for the different CSV files.
receivers:
  filelog/LightningInteractionLogs_multiple:
    include: [file*.csv]
    start_at: beginning
    operators:
      - type: csv_parser
exporters:
  logging:
    loglevel: debug
service:
  pipelines:
    logs:
      receivers: [filelog/LightningInteractionLogs_multiple]
      exporters: [logging]
For all the logs emitted from
Dan, is it possible to use EDIT: closed by mistake.
Found an older related issue: #10275
Thanks for clarifying @VihasMakwana. I think it might be possible to accomplish this using the The general idea would be that you configure the
I can take this up.
Component(s)
receiver/filelog
What happened?
Description
I am using the filelog receiver in the OpenTelemetry Collector Contrib to parse CSV log files. When parsing a single file with a predefined header, the configuration works as expected. However, when attempting to process multiple CSV files with different headers, there is no way to dynamically handle varying headers.
If the header is omitted, the configuration fails with an error. This limitation makes it impossible to manage directories containing multiple CSV files with different structures efficiently.
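For context, the workaround available with the existing options is to declare one filelog receiver per CSV layout, each with its own static header, which is what makes large mixed directories awkward to manage. The sketch below is illustrative only: the receiver names, the `layout_a` and `layout_b` directories, and both header strings are hypothetical, and the header row of each file would still be read as an ordinary record.

```yaml
receivers:
  # Hypothetical receiver for files that share one column layout.
  filelog/layout_a:
    include: [/u01/SFLogs/layout_a/*.csv]
    start_at: beginning
    operators:
      - type: csv_parser
        header: id,severity,message
  # A second hypothetical receiver is needed for every additional layout.
  filelog/layout_b:
    include: [/u01/SFLogs/layout_b/*.csv]
    start_at: beginning
    operators:
      - type: csv_parser
        header: timestamp,user_id,url,duration
exporters:
  debug:
    verbosity: detailed
service:
  pipelines:
    logs:
      receivers: [filelog/layout_a, filelog/layout_b]
      exporters: [debug]
```

Every new layout means another receiver block and another entry in the pipeline, so this does not scale to a directory where the set of headers is not known in advance.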
Steps to Reproduce
Configure the filelog receiver to parse a single CSV file with a specified header:
receivers:
  filelog/LightningInteractionLogs_quoted:
    include: [/u01/SFLogs/8292024/continuationcallout_hundred.csv]
    start_at: beginning
    operators:
      - type: csv_parser
        header: ApplicationName, page_app_name, Application_Version, Environment, HostName, EventType, timestamp, user_id, user_name, url, duration, request_form_size, response_size, status_code, success, TimestampDerived
Attempt to configure the receiver to include multiple CSV files with varying headers:
receivers:
  filelog/LightningInteractionLogs_multiple:
    include: [/u01/SFLogs/*.csv]
    start_at: beginning
    operators:
      - type: csv_parser
        # No way to handle multiple headers dynamically
Observe the failure when the header is not explicitly provided:
Error: failed to build pipelines: failed to create "filelog/LightningInteractionLogs_multiple" receiver for data type "logs"; missing required field "header" or "header_attribute"
Expected Result
The csv_parser operator should be able to:
Dynamically detect headers from the first row of the CSV file (e.g., via a dynamic_header option).
Alternatively, allow mapping specific headers to specific files or file patterns using a header_attribute or similar configuration.
For example:
receivers:
  filelog/LightningInteractionLogs_dynamic:
    include: [/u01/SFLogs/*.csv]
    start_at: beginning
    operators:
      - type: csv_parser
        dynamic_header: true
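The error message already names `header_attribute`, which lets the csv_parser read its column names from an entry attribute instead of a fixed string. A minimal, untested sketch of how that might be combined with the filelog receiver's `header` setting is shown here. It assumes the `filelog.allowHeaderMetadataParsing` feature gate is enabled, that every header row in the directory begins with `ApplicationName` (an assumption taken from the single header shown in this issue; files whose first column differs would need a broader pattern), and that `csv_header` is a hypothetical attribute name.

```yaml
receivers:
  filelog/LightningInteractionLogs_dynamic:
    include: [/u01/SFLogs/*.csv]
    start_at: beginning              # header parsing needs the file read from the start
    header:
      # Assumption: each file's header row starts with this column name.
      pattern: '^ApplicationName,'
      metadata_operators:
        # Capture the whole header row into a hypothetical attribute.
        - type: regex_parser
          regex: '^(?P<csv_header>.*)$'
    operators:
      - type: csv_parser
        # Take the column names from the attribute set by the header pipeline.
        header_attribute: csv_header
```

The main limitation of this approach is the `pattern`: it has to match the header row of every file while never matching a data row, which may not be possible when the layouts have nothing in common.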
Actual Result
The configuration fails when header is not explicitly provided, making it impossible to process multiple CSV files with different headers in the same receiver configuration.
Error message:
Error: failed to build pipelines: failed to create "filelog/LightningInteractionLogs_multiple" receiver for data type "logs"; missing required field "header" or "header_attribute"
Collector version
v0.109.0
Environment information
Environment
OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")
OpenTelemetry Collector configuration
Log output
Additional context
No response