TROC Engineer - Monitoring

Infosys

Santiago, Santiago Metropolitan Region, Chile
Posted on Aug 29, 2025

Job details

Work Location

Santiago


State / Region / Province

Santiago Province


Country

Chile


Domain

Delivery


Interest Group

Infy Chile


Skills

Technology|Infrastructure- Administration,Monitoring|Infrastructure- Administration,Monitoring - ALL


Company

IL Chile


Requisition ID

138339BR


Job description

Position: Monitoring Engineer
Job Description:

The Monitoring Engineer is responsible for designing, implementing, and maintaining systems that proactively monitor the health and performance of the customer organization's OT infrastructure, including servers, networks, applications, and databases. The role identifies potential issues and alerts the relevant teams so they can take corrective action before major outages occur. The engineer will use specialized monitoring tools and platforms to analyze data and report on the performance and reliability of the systems provided by the customer.

The main responsibilities are:

  • Design and implementation of the monitoring system:
    • Select and configure monitoring tools based on the needs of the infrastructure under monitoring.
    • Manage, optimize and continuously improve the client’s monitoring and observability tool.
    • Ensure that L1 and L2 support teams, on-site support personnel and the 24x7 monitoring team can effectively monitor the technology infrastructure within the scope of service.
    • Design, implement and maintain advanced configurations on monitoring platforms, ensuring consistency with configuration databases.
    • Develop dashboards and visualizations to present key performance indicators (KPIs).
    • Establish the monitoring requirements for the transition to production of technological projects.
    • Define and manage critical and customized alerts, ensuring they are relevant to monitored services.
    • Set alert thresholds and notification protocols for critical events.
    • Design, update and deploy service maps and management panels for the elements within the scope of service, adapted to the needs of stakeholders.
    • Collaborate in the implementation of new monitoring technologies, platform integration, migrations and decommissioning of obsolete tools.
    • Analyse and project the capacity of monitoring platforms to ensure they adapt to future growth requirements.
    • Assist in the integration of monitoring systems with incident management tools to escalate issues in a timely manner.
    • Research, diagnose and propose solutions for devices or services that are difficult to monitor effectively.
    • Evaluate emerging tools, methods, and configurations that can facilitate the monitoring of devices with technical constraints or complexities.
    • Implement regular training programs for the monitoring team, ensuring that they are aware of the latest technologies, tools, and best practices in monitoring and management of technological infrastructure. This includes training sessions, documentation, and workshops.
  • Performance analysis:
    • Meet contractually agreed SLAs and service KPIs, and conduct a monthly review.
    • Present the status of the in-scope infrastructure through monthly reports.
    • Perform deep analysis of metrics and events to identify critical patterns and trends that may impact the performance and stability of services.
    • Analyze real-time and historical monitoring data to identify bottlenecks and performance trends.
    • Investigate performance degradations, errors, and system anomalies to identify root causes.
    • Generate reports on system health and performance metrics to inform capacity management planning and optimization.
  • Incident response and troubleshooting:
    • Enable the rapid and accurate identification of relevant events to minimise the impact on operations, using the client’s monitoring tools.
    • Respond to alerts and proactively troubleshoot issues to minimize downtime.
    • Use advanced methodologies and technologies to anticipate and mitigate potential incidents, adapting to BHP's operational needs.
    • Manage requirements and incidents related to the monitored platforms, providing effective and timely solutions.
    • Generate detailed reports on the problems identified and the solutions implemented, providing feedback to prevent similar situations in the future.
    • Collaborate with other teams to identify and resolve system issues.
    • Conduct post-incident analysis to identify areas for improvement and implement preventative measures.
  • Maintenance and optimization:
    • Participate in meetings to establish priorities and service planning with the client.
    • Monitor the lifecycle of monitoring tools, ensuring their continuous updating and alignment with market best practices.
    • Regularly review and update monitoring configurations to reflect changes in infrastructure.
    • Manage users and privileges of the platforms, perform periodic audits and generate reports.
    • Maintain an updated inventory of entities on each monitoring platform and ensure that they are in sync with the CMDB.
    • Provide robust and adaptable support for monitoring and observability tools and platforms.
    • Implement constant improvements to optimize the performance and effectiveness of monitoring tools.
    • Perform system updates and patch management for monitoring tools.
    • Collaborate with technical teams in the planning and execution of necessary adjustments in configurations, networks or devices to ensure their integration into monitoring platforms.
    • Optimize monitoring processes to improve efficiency and reduce false positives.



Candidate requirements:
  • Information technology professional with at least a bachelor's degree.
  • At least 5 years of experience in monitoring roles for large IT infrastructures.
  • Strong understanding of IT infrastructure components such as servers, networks, databases, and applications.
  • Experience with the SolarWinds monitoring platform is mandatory. Knowledge of Grafana and IMS is desirable.
  • Ability to interpret complex monitoring data and identify patterns.
  • Excellent troubleshooting skills to diagnose and resolve operational environment issues.
  • Knowledge of scripting languages (e.g., Python, PowerShell) for custom automation and monitoring tasks is desirable.
  • Effective communication to collaborate with cross-functional teams and report findings clearly.



About Us
Infosys is a global leader in next-generation digital services and consulting. We enable clients in more than 50 countries to navigate their digital transformation. With over four decades of experience in managing the systems and workings of global enterprises, we expertly steer our clients through their digital journey. We do it by enabling the enterprise with an AI-powered core that helps prioritize the execution of change. We also empower the business with agile digital at scale to deliver unprecedented levels of performance and customer delight. Our always-on learning agenda drives their continuous improvement through building and transferring digital skills, expertise, and ideas from our innovation ecosystem.

EEO
Infosys provides equal employment opportunities to applicants and employees without regard to race; color; sex; gender identity; sexual orientation; religious practices and observances; national origin; pregnancy, childbirth, or related medical conditions; or disability.