Sun Tao

Senior Operation and Maintenance Engineer
Shanghai

Summary

1. Rich experience in operation and maintenance management in the automotive industry; proficient in the operation and maintenance processes of large automobile enterprises.

2. Deep understanding of the cooperation model between C-end customer demand and B-end enterprises; effectively promotes coordinated business development.

3. Keeps up with the forefront of operation and maintenance technology and quickly absorbs and applies new technologies to improve service efficiency.

4. Develops and implements customized operation and maintenance policies based on business characteristics to ensure business continuity and stability.

5. Excellent teamwork skills; plays a key role in cross-departmental projects.

6. Optimizes operation and maintenance processes through data analysis and problem solving, improving overall business performance.

7. Strong leadership and project management skills; able to lead a team to achieve business goals.

Overview

9 years of professional experience
5 years of post-secondary education

Work History

Senior DevOps Engineer

Shanghai Jidu Automobile Co., Ltd.
Shanghai
08.2021 - 12.2024
  • Led the IT business operation and maintenance team, covering organizational structure design, capability building, and daily operations management, so that the team efficiently supported the company's diverse IT business needs.
  • Spearheaded the planning and construction of operation and maintenance infrastructure, spanning public cloud services (Baidu Cloud), Kubernetes (K8s) environments, a VMware private cloud, and CloudStack solutions, to ensure business continuity and data security.
  • Built the operation and maintenance monitoring and logging systems on the Prometheus + VictoriaMetrics + Grafana stack, providing comprehensive monitoring of the K8s environment, Linux/Windows/GPU devices, EMC/Ceph storage, and network devices. Detected fault risks promptly and improved the efficiency of problem response and resolution.
  • Implemented change management, capacity planning, and problem-handling procedures. Optimized the quality of operation and maintenance services, reduced system risks, and improved business stability.
  • Managed and optimized operation and maintenance costs, maximizing cost-effectiveness through resource allocation and technological innovation.
  • Facilitated the digital transformation of enterprise operations. Supported the operation and maintenance of key business systems like vehicle R&D PLM and vehicle simulation HPC, promoting the automation and intelligence of business processes.

Senior DevOps Engineer

Shanghai Xiandou Intelligent Robot Co., Ltd.
06.2020 - 08.2021
  • Operated and maintained the company's connected vehicle TSP platform, CPSP business, and message push business. On the company's public cloud Kubernetes platform, provided capacity planning, service monitoring, external release, and change release services for the business.
  • Managed the company's Tencent Cloud and Alibaba Cloud platforms and offered guidance to developers on the use of cloud products.
  • Handled fault localization and personnel coordination between the TSP platform and the TBOX, vehicle, and mobile ends during pre-SOP commissioning of multiple vehicle models.
  • Participated in the review of the new business launch plan and provided optimization suggestions.
  • Operated, maintained, and troubleshot the company's Kubernetes cluster.

DevOps Engineer

Aiways Auto Co., Ltd.
08.2018 - 06.2020
  • Operated and maintained Aiways Auto's connected vehicle business, built on Alibaba Cloud in China. Responsibilities included deploying the TSP platform, all network planning, continuous deployment, and ongoing operation and maintenance. The system encompassed the TSP, app, operation platform, and new energy monitoring system. Participated in technology selection and solution design for the MNO (APN) project, and ensured APN access and operation and maintenance at the TSP end for high-security services such as remote control commands to the vehicle TBOX and vehicle condition data reporting. Implemented and maintained the vehicle connection security project, which adopted a domestic vendor's security solution with two-way HTTPS authentication to protect TCP and HTTP traffic in the channel.
  • Managed the C-end operation and maintenance of Aiways Auto, re-planning the work from an SRE perspective. Clarified SLO indicators through communication with business, R&D, and testing teams, ensuring business availability while supporting rapid iteration. Built a reliable continuous deployment pipeline and monitoring system, formulated capacity plans based on actual business operation, designed a highly available network load balancing scheme, built a log management system, and coordinated with other departments to move projects forward.
  • Handled the operation and maintenance of the overseas (European) connected vehicle business, which was built on AWS. Collaborated with MSP vendors to speed up the deployment of overseas projects and their subsequent operation and maintenance. Although the overseas business was small relative to the domestic one, it brought challenges from a different cloud vendor as well as network and compliance issues.
  • Worked on CI/CD, the monitoring system, and the log system.

DevOps Engineer

Shanghai DragonNet Technology Co., Ltd.
06.2016 - 08.2018
  • Seconded to the Shanghai Automotive Cloud Computing Center to manage the front-line operation and maintenance team. The team's main responsibilities included IDC machine room hardware inspection, reporting and tracking of alarm events, and on-site emergency response.
  • Formulated a monitoring plan for the virtual machines running on the OpenStack platform in the cloud computing center and selected Zabbix as the main monitoring tool.
  • Took responsibility for the system operation and maintenance of the TSP platform of SAIC's connected vehicle project. Worked with colleagues from external affiliated companies to solve problems encountered by the TSP platform, mainly data center infrastructure failures caused by a large number of vehicles online, such as router, firewall, and Linux system bottlenecks.
  • Led and implemented the introduction of the ELK technology stack, establishing a complete set of log processing tools for the data center, including log data visualization using ES + Grafana. This work was further refined as part of the data center's classified protection assessment.

Education

Undergraduate Degree - Computer Science and Technology

Changzhou University
Changzhou
01.2018 - 01.2020

Junior College - Marketing

Anhui Post and Telecommunications Vocational and Technical College
01.2011 - 01.2014

Skills

Scripting languages

Monitoring and logging

Infrastructure automation

Configuration management

Personal Advantages

  • Rich experience in operation and maintenance management of the automobile industry.
  • Deep understanding of cooperation modes between C-end customers and B-end enterprises.
  • Ability to quickly absorb and apply new technologies to improve service efficiency.
  • Development and implementation of customized operation and maintenance policies.
  • Excellent teamwork ability in cross-departmental projects.
  • Optimization of operation and maintenance processes through data analysis.
  • Strong leadership and project management skills.

Accomplishments

  • Improved system availability to 99.95% and reduced annual cloud resource costs by more than 80%.
  • Achieved 100% system availability for two consecutive years in the Geely factory.
  • Established an operation and maintenance evaluation system, with no outage incidents within one year.
  • Built the company's CI/CD toolchain from scratch, greatly reducing service downtime.

Experience Years

11

Intention

Operations Engineer

Expected City

Shanghai

Personal Information

  • Age: 31
  • Expected Salary: 35-45K
  • Gender: Male
