Data Engineer (LOCAL)

Esolvit
Washington, DC, United States
Full-time

Title : Data Engineer (LOCAL)

Location : Washington, DC

Duration : Long Term

You may either create a user ID and sign up via the Arytic link, or proceed without signing up, to access this job and other open roles on our Arytic platform.

Job Description :

The Data Engineer provides ETL support to the data science and software engineering team members. The role builds, modifies, and supports infrastructure for optimal extraction, transformation, and loading of data from a variety of structured and unstructured data sources and multi-terabyte distributed file systems.

The candidate will formulate and rapidly prototype various approaches and effectively communicate the pros and cons of each.

Provide data-driven approaches to tackle various business problems. The candidate will contribute to a high-performing, motivated workgroup by applying interpersonal and collaboration skills to achieve project goals.

Architect an ML data pipeline with data acquisition and preprocessing functionality that gathers data from a heterogeneous data pool: the distributed file system, unstructured text extracted from millions of medical-record images with varied OCR quality, their metadata from relational databases, and custom annotations.

Responsibilities :

  • Provide current system architecture documentation, engineering / web development programming support for tasks defined by program / project requirements, and data science / data engineering related technical assessments
  • Manage / maintain structured, semi-structured, and unstructured data, structuring and wrangling data as appropriate for statistical analysis
  • Implement data warehouse concepts and relational databases, big data management techniques and tools (e.g. Hadoop, MapReduce)
  • Communicate with technical and non-technical users and managers; perform server administration, including hardware and software support for existing servers
  • Provide software engineering support to operate, maintain and enhance systems that are integrated with and / or relied upon by the data engineering lifecycle
  • Integrate, analyze, and visualize data and information in near real-time (within 24 hours) from multiple disparate data sources.
  • Optimize data storage and access
  • Proficiency with Python and Java, Oracle Enterprise Manager, SQL, AWS

Qualifications :

  • Master's degree in related field + 5 years experience; or PhD + 1 year experience; or Bachelor's degree in related field + 7 years experience
  • Minimum of 5 years experience conducting ETL tasks, performance engineering, run-time optimization, large data volume transfers
  • Minimum 3 years experience with Regular Expressions, SQL (PostgreSQL), NoSQL (MongoDB)
  • Minimum 1 year experience with Version control systems (Git)
  • Preference given to developers with experience working with healthcare data and health IT

Skills / Tools Utilized (at least 1-2 years exp in some of the following) :

  • Apache Hadoop (Cloudera)
  • AWS Data Platforms (Redshift, S3, EMR / Hive)
  • Java
  • Kafka
  • Scala
  • Kotlin
  • Neo4j
  • NiFi
  • Flink
  • Sqoop
  • PostgreSQL
  • Apache Spark
  • Python
  • Oracle
  • Splunk
  • Testing framework: Cucumber
  • Knowledge of and experience using various NLP approaches, particularly:
      • Pattern recognition / feature extraction
      • Supervised, unsupervised, and semi-supervised learning techniques
      • Understanding of various language models (N-Gram, Skipgram, NLM, etc.)
      • Chunking / tokenization
      • Semantic parsing

Skills highly desired :

  • Healthcare IT experience
  • Statistical model building (particularly classification)

Required skills :

  • 10+ years of experience in MongoDB
  • 10+ years of experience in PostgreSQL.
  • 10+ years of experience in Cloud Application Architecture