IT Support Manager

Chimes International LTD

Full-time | Contract

Baltimore, MD

Job description

Overview
The Data Processing Site Manager is responsible for overseeing the daily operations and performance of a large-scale data processing facility. This role ensures that all systems, infrastructure, and personnel operate efficiently to achieve uptime, throughput, and performance goals. The manager coordinates between technical, administrative, and operational teams to maintain optimal site productivity, safety, and compliance.

This is a hands-on technical role, not a corporate management position.

Our facility is a Tier-0 data processing site (smaller than a full-scale enterprise data center), similar to an industrial hosting or compute facility. It requires someone who wants to physically operate, maintain, troubleshoot, and optimize the site’s equipment and infrastructure.

We will train you on our specific computer hardware, but certain core abilities cannot be taught: mechanical sense, basic electrical understanding, troubleshooting instincts, and the ability to work independently.

This job is for someone who enjoys solving problems, working with their hands, and taking ownership of a technical environment that must maintain uptime around the clock.

Who This Job Is For — and Who It’s Not For

We Are Looking For Someone Who:

- Likes being hands-on: fixing machines, swapping hardware, running cables, inspecting pumps and panels.

- Takes initiative without needing step-by-step directions.

- Can read basic schematics, wiring diagrams, or layout drawings.

- Understands fundamentals of:

  • basic electrical work
  • plumbing / pumps / cooling loops
  • troubleshooting mechanical or electronic issues

- Can think through problems logically and solve them independently.

- Wants ownership of site uptime and performance.

This Job is NOT For:

- People who prefer managing others rather than doing the work themselves.

- Corporate data center managers who mostly delegate tasks.

- Anyone whose default mode is “tell me exactly what to do next.”

- Paper managers, clipboard supervisors, or task administrators.

- People who are not comfortable being available when issues arise.

Hands-On Responsibilities and Daily Operations

- Perform physical work on-site: racking equipment, replacing units, running cables, checking cooling loops.

- Monitor power usage, temperature, water flow, Starlink connectivity, and equipment health.

- Respond immediately to alerts, failures, or performance drops.

- Conduct site walk-throughs to identify and resolve issues early.

Maintenance & Repair

- Perform preventive maintenance on electrical, mechanical, and IT systems.

- Inspect and maintain pumps, filters, cooling systems, PDUs, breakers, and network equipment.

- Handle minor repairs independently; coordinate outside contractors for major work only.

Hardware & Infrastructure

- Install and commission new compute units, servers, miners, routers, and cabling.

- Assist with expansions, new containers, rack builds, and infrastructure upgrades.

- Test and validate hardware performance under load.

Documentation & Reporting

- Log power data, performance metrics, cooling trends, and uptime events.

- Keep clear site notes, maintenance logs, and equipment tracking.

- Report major issues to ownership directly—no corporate layers.

Safety & Standards

- Follow proper lockout/tagout and electrical safety procedures.

- Keep work areas clean, organized, and safe.

- Maintain basic compliance with local codes and industrial standards.

Required Skills (Must Already Have These)

We can train you on our hardware. We cannot teach these fundamentals:

- Strong mechanical aptitude

- Basic electrical understanding (breakers, panels, PDUs)

- Familiarity with plumbing or cooling loops (pumps, flow, filters)

- Ability to use tools, meters, and testers

- Ability to read basic schematics or wiring diagrams

- Ability to troubleshoot independently

Preferred (Trainable or Nice-to-Have) Skills

- Experience in:

  • crypto mining
  • hosting facilities
  • small data centers
  • industrial or manufacturing sites

- Familiarity with:

  • servers and racking
  • Starlink or fiber connectivity
  • network switches and routers
  • HVAC basics or cooling system operation

Workload & Availability Expectations

When tuned correctly, the site runs smoothly and can be very stable. However, problems can occur at any time.

This role requires someone who:

- Can be reachable around the clock for urgent issues

- Can respond quickly to protect uptime

- Is OK with some days being calm and others being hands-on

- Values being the single point of reliability for the facility

Analytical & Strategic Skills

- Root cause analysis: using data and metrics to identify trends and prevent recurring operational issues.

- Continuous improvement: finding weak points in the site and implementing process enhancements to improve reliability and efficiency.

- Data analysis and reporting: tracking and interpreting operational data and uptime metrics to inform decisions and planning.

- Project management: planning and coordinating new site builds, expansions, or upgrades within timelines and budgets.

APPLICANTS THAT DO NOT ANSWER ALL OF THE PRE-SCREENING QUESTIONS WILL NOT BE CONSIDERED.

Job Types: Full-time, Contract

Pay: $75,000.00 - $85,000.00 per year

Benefits:

  • Dental insurance
  • Health insurance
  • Paid time off
  • Vision insurance

Application Question(s):

  • Can you describe your experience managing daily operations in a data center or similar technical environment?
  • What systems or tools have you used to monitor uptime, power usage, or environmental conditions (e.g., DCIM, SCADA, etc.)?
  • How familiar are you with power distribution and cooling systems? Can you explain how you’ve managed or optimized them in a prior role?
  • Describe your experience with preventive maintenance schedules for electrical, mechanical, or IT equipment.
  • What steps would you take if part of the site experienced an unexpected power or cooling failure?
  • Have you ever managed large-scale hardware installations (e.g., servers, miners, or containers)? Walk me through your process.
  • How do you ensure data processing throughput or site performance remains stable during high-load periods?

Work Location: In person