Client:
Loan company from the US
[ Detailed information about the client cannot be disclosed under the provisions of the NDA ]
Project workflow
Challenge
The client’s development team had grown significantly in a short period and was spread across several locations. The team relied on a legacy pipeline automation tool whose main logic was implemented as a Python application. The application used Jenkins nodes for builds and a UI written in Node.js for pipeline visualization, diagrams, and schemas.
The client faced massive delays in releasing new features, bug fixes, and updates because of pipeline issues.
Because the pipeline solution was custom-built, it was hard to fix problems in time, so the dev team was periodically blocked.
Solution — Preliminary Investigation
The initial investigation showed that only one DevOps engineer had been supporting the infrastructure. After he left the company, infrastructure support responsibilities passed to the Infrastructure Development Team (IDT). Nobody in IDT knew precisely how to maintain and support the CI/CD tool or fix issues properly.
Moreover, as the Development Team (DT) grew and became more distributed, IDT could not handle the number of support requests from DT. The development process suffered from increased idle time and release delays.
The decision was to create a separate Infrastructure Support Team (3 DevOps engineers) to fix pipeline issues, rework the pipeline, and provide 24/7 support for the Development Team.
Solution — What was done
We developed a plan according to client needs:
- The custom CI/CD tool was reverse-engineered and documented;
- A guide for adding new functionality to the CI/CD tool was written and documented for future use;
- Request automation, Confluence guidelines, and runbooks were enriched;
- Pipeline functionality was restored;
- With the CI/CD documentation in place, any team member could fix issues in the CI/CD tool or add new features to it;
- The transition of infrastructure support to a team of L1 engineers was organized.
Results
Increased team efficiency
Eliminated release delays
Timings
The project lasted 24 months, including 6 months for the initial phase to rework, document, and set up everything.
Results achieved:
- Increased development team efficiency: By removing non-development tasks and addressing infrastructure issues promptly, the development team’s efficiency improved.
- Eliminated release delays due to infrastructure issues: The client achieved zero releases postponed because of infrastructure issues.
- Fewer support requests: The number of support requests decreased significantly thanks to request automation and the enrichment of Confluence guidelines and runbooks.
Technologies used
The central car control unit project, based on an RTOS developed from scratch, encapsulates low-level tools:
- QNX OS
- C++
- Rust
- Python
- Jenkins
- Docker
- QA framework
- shell code
- Sonar
- Fortify
- Grafana
- AWS