Job Role: Observability SME
Location: Boston, Massachusetts (Hybrid, 3 days a week, Tue-Thu)
Key Responsibilities:
- Lead the strategy and implementation of synthetic monitoring solutions (e.g., uptime checks, transaction simulations, endpoint testing).
- Manage and optimize infrastructure log collection, parsing, storage, and analysis across servers, network devices, and cloud platforms.
- Set up and fine-tune monitoring tools to detect anomalies, bottlenecks, and performance degradation proactively.
- Integrate observability solutions with alerting, incident management, and automation workflows.
- Create and maintain dashboards, alerts, and log queries for operational visibility.
- Collaborate with infrastructure, application, and DevOps teams to ensure end-to-end observability.
- Conduct root cause analysis using synthetic and log-based signals.
- Define observability standards and best practices for log retention, normalization, and alert thresholds.
- Provide guidance and training on effective use of synthetic and logging tools.
Required Skills & Experience:
- 5–10 years of experience in IT operations or SRE roles with a strong focus on monitoring and logging.
- Hands-on expertise with synthetic monitoring tools such as Pingdom, ThousandEyes, Datadog Synthetic Monitoring, Uptrends, Catchpoint, or New Relic Synthetics.
- Strong knowledge of log management tools like ELK Stack (Elasticsearch, Logstash, Kibana), Fluentd, Graylog, Splunk, or Loki.
- Experience integrating synthetic checks with alerting platforms (e.g., PagerDuty, Opsgenie).
- Ability to write log queries and alerts to detect anomalies and automate incident response.
- Familiarity with infrastructure (Linux/Windows servers, network devices) and cloud environments (AWS, Azure, GCP).
- Understanding of SLAs, SLOs, SLIs, and monitoring KPIs.
- Experience with scripting (Python, Shell, PowerShell) for log parsing and synthetic test creation.
Preferred Qualifications:
- Certifications in relevant observability platforms (e.g., Splunk, Datadog, Amazon CloudWatch).
- Experience with OpenTelemetry or other telemetry standards.
- Exposure to containerized environments (Docker, Kubernetes) and their observability needs.
- Understanding of security monitoring or compliance-related logging (SIEM concepts).
Govind Awasthi
Associate – Talent Acquisition
KAnand Corporation
Phone: 512-572-2762
Office: 512-697-9762 Ext 419
Web: www.kanandcorp.com