Enhance the ETL pipeline for Driver Trace Stats
This merge request simplifies the existing ETL pipeline for Driver Trace Stats by using OOP to group functions into methods and share data more effectively. The pipeline also moves to the new `python-gitlab` and `influxdb` libraries, updating its wrappers and clients accordingly.
Some of the significant changes in this MR include:
- Creating the `main_pipelines_from_mrs` method in the gitlab wrapper to fetch all merge request pipelines that resulted in a merge into the main branch.
- Setting the timings of performance data to the pipeline's `finished_at` instead of `updated_at` to avoid skewing data.
- Adding the ability to replay missing performance jobs through `python-fire`.
- Hardcoding the last write parameter using the `INFLUXDB_LAST_WRITE` environment variable.
- Adding more customization options for driver perf data extraction, including the ability to fetch data from specific branches of mesa forks.
- Using iterator=True instead of manual pagination in the gitlab-wrapper.
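
For reviewers, here is a minimal sketch of how the `finished_at`, `INFLUXDB_LAST_WRITE`, and iterator=True items above could fit together. It is not the code in this MR: the helper names `resolve_last_write` and `finished_pipeline_times`, the `ref` default, and the ISO 8601 format of the environment variable are all assumptions made for illustration. `project` is assumed to behave like a python-gitlab project object.

```python
import os
from datetime import datetime


def _parse_iso(value):
    # GitLab timestamps look like "2023-03-01T12:34:56.000Z"; replace the
    # trailing "Z" so datetime.fromisoformat also accepts it before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def resolve_last_write(query_last_write, env=os.environ):
    """Return the last-write time, preferring INFLUXDB_LAST_WRITE if set.

    Hypothetical helper: `query_last_write` stands in for whatever call
    asks InfluxDB for the most recent point. The ISO 8601 format of the
    variable is an assumption.
    """
    raw = env.get("INFLUXDB_LAST_WRITE")
    if raw:
        return _parse_iso(raw)
    return query_last_write()


def finished_pipeline_times(project, since, ref="main"):
    """Yield (pipeline id, finished_at) for pipelines finished after `since`."""
    # iterator=True asks python-gitlab for a lazy generator that fetches
    # further pages on demand, replacing a manual page loop.
    for listed in project.pipelines.list(ref=ref, iterator=True):
        # The list endpoint may return a reduced payload, so fetch the
        # full pipeline before reading finished_at.
        pipeline = project.pipelines.get(listed.id)
        if pipeline.finished_at is None:
            continue  # still running or never finished
        finished_at = _parse_iso(pipeline.finished_at)
        if finished_at > since:
            yield pipeline.id, finished_at
```

Using `finished_at` rather than `updated_at` as the timestamp means that a later retry or comment on the pipeline does not move the data point in time.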
Overall, these changes will help make the ETL pipeline more efficient, accurate, and customizable.