What You’ll Be Doing
Responsible for effective administration, configuration, installation, provisioning, monitoring and troubleshooting to ensure the availability, capacity and performance of infrastructure systems. Perform assigned duties as listed in RACI model(s) under general direction from the manager. Collaborate with peers, architects, project managers, managers and business stakeholders on design, strategy and project activities.
Support SchoolsFirst's complex Compute and Storage infrastructure with a focus on one or more key functional areas: Operating Systems, OpenShift Containers, Unix, Linux and Storage.
Design, install, and configure systems to support infrastructure and applications.
Develop and maintain configuration standards and operating procedures.
Coordinate with vendors for technical support and upgrades.
Perform intermediate design through independent thinking and application of best practice principles.
Troubleshoot complex issues or problems, including escalation internally or to third-party vendor support, to drive root cause analysis.
Forecast, recommend, and implement capacity planning.
Manage vendors and hold them accountable to contractual SLAs and obligations.
Submit and fulfill service requests.
Respond to incidents and problems.
Provide KPI and metrics for reporting.
Perform daily system monitoring, verifying the integrity and availability of network infrastructure.
Conduct after-hours maintenance.
Perform ongoing performance tuning, hardware upgrades, and resource optimization as required.
Gather and analyze system log files.
Research and recommend innovative and automated approaches for system engineering tasks.
Coordinate and collaborate with team members and service stakeholders.
Provide assistance for support cases escalated by Systems Administrators and Systems Engineers.
Perform peer review of changes submitted by Systems Administrators and Systems Engineers.
Provide second-tier support at engineer level, including investigation and troubleshooting.
Monitor ITSM tickets and prioritize appropriately.
Design and support disaster recovery and business continuity solutions, failover and testing efforts.
This position will be part of a paid on-call rotation supporting the production environment 24x7x365.
Additional Job Functions
Performs other duties as assigned.
Complies with regulatory and assigned training requirements, including but not limited to BSA regulations corresponding to their specific job duties. Failure to do so may result in disciplinary and other employment-related actions.
Qualifications
Bachelor's Degree with a technical major, such as engineering or computer science, or equivalent years of experience required
10+ years of system administration experience required
Red Hat Certified System Administrator (RHCSA) preferred
IBM CSE-Virtualization preferred
Knowledge, Skills, and Abilities
Expert knowledge with eleven to twelve years of experience with one or more of the following: ▪ Credit Union specific applications ▪ General back office applications ▪ Dell Data Protection ▪ SolarWinds, Splunk, AppDynamics or other monitoring tools
Expert knowledge with eleven to twelve years of experience with one or more of the following Operating Systems: ▪ Red Hat Linux ▪ IBM AIX ▪ Red Hat OpenShift Container infrastructure ▪ Quay ▪ VMware vCenter and vSphere
Expert knowledge with eleven to twelve years of experience with one or more of the following Compute systems: ▪ IBM P Series ▪ HP NonStop ▪ Cisco UCS B and C Class ▪ Nutanix
Expert knowledge with eleven to twelve years of experience with one or more of the following Storage systems: ▪ Dell PowerMax ▪ Pure Storage ▪ NetApp FAS / All-Flash ▪ Cisco MDS SAN
Expert knowledge of operating system scripting and utilities.
Expert knowledge of SNMP and log monitoring tools.
Expert knowledge of TCP/IP and the OSI model.
Expert knowledge of firewalls, routers and switches.
Expert knowledge of storage protocols: iSCSI, Fibre Channel and NFS.
Expert knowledge of audit and security best practices (NIST, PCI, ISO).
Expert knowledge of Data Center standards including cabling, fire suppression, power and safety.
Design, implement and manage SAN infrastructure, configure Fibre Channel switches, and manage storage arrays such as PowerMax, NetApp and Pure.
Experience with Dell PowerMax, including Unisphere for provisioning and management of LUNs.
Handle LUN mapping, zoning, and multipathing for AIX applications.
Create and manage SRDF replication sets for critical applications.
Solid troubleshooting skills on performance and fabric issues.
Proficient with Cisco MDS SAN director-class switches, including managing zones and VSAN and NPIV configurations.
Understanding of storage snapshot backups and ability to create and manage snapshots.
Experience with Pure Storage is a plus.
Work with storage vendors to upgrade and patch PowerMax and Pure Storage software.
Ransomware experience is a plus.
Understanding of Red Hat’s Kubernetes-based platform for container orchestration.
Experience with deploying, configuring, managing, and scaling OpenShift clusters.
Knowledge and experience with CI/CD pipelines.
Monitoring using tools like Prometheus and Grafana for troubleshooting cluster issues.
Experience with performing OpenShift cluster rolling upgrades to reduce downtime.
Working knowledge of tools like Docker and Podman.
Securing clusters with RBAC and OAuth.
Experience with Ansible Automation Platform.
Senior Unix/Linux Systems Engineer • Senior Systems Engineer • Tustin, CA