A Boston-area analytics company is looking to bring on a few more devs to work on various short-term and future projects. Some of the details are below (any more detail requires a signed NDA or the possibility of death). Roles range from senior to team lead, and both have architecture elements. The openings exist because some folks moved on after life changes, others were promoted, and the teams are growing as a result of both new business and customers requiring substantially more functionality.
The head of software technology is pretty darn easy-going and is a Luke Skywalker wannabe who keeps a lightsaber handy (though it always seems to need new batteries). The company began as a startup, grew, and made money, yet retains some of that startup feel (just like adults have the knack for acting like goofy kids when needed). You don’t have to be in the office all the time, and if you wear business casual you will likely be met with multiple Golden Retriever head-tilts.
So read on – and LMK what resonates with you (my email is at the bottom). No résumé is needed because we’re just talking…
Details About The Pipeline
- Pipeline processing utilizes Torque (batch server) and Hadoop
- Data needed for processing are stored in MySQL and Vertica databases
- Vertica is used for generating reports used by Data Quality and Data Acquisition; pipeline jobs are orchestrated by a proprietary interface built on top of Luigi; ad-hoc access to pipeline files in HDFS is provided via Hive
- Initial prep of pipeline files, such as cleansing and removal of sensitive data, is done via Python scripts and compiled C++ programs; the bulk of the rest of the processing is implemented as Python scripts and Map/Reduce jobs in Hadoop using the streaming interface
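For a rough feel of what a streaming-style prep-and-count step can look like in Python, here is a minimal sketch. It is not the company's code: the record layout (user id, email, domain), the `SENSITIVE_COLUMNS` set, and the function names are all made up for illustration. The only real contract it leans on is that Hadoop streaming feeds a mapper lines of text and hands the reducer `key<TAB>value` lines sorted by key.

```python
from itertools import groupby

# Hypothetical record layout (an assumption, not from the posting):
# user_id <TAB> email <TAB> domain
SENSITIVE_COLUMNS = {1}  # index of the email column, dropped during prep


def scrub(line):
    """Drop the sensitive columns from one tab-separated record."""
    fields = line.rstrip("\n").split("\t")
    return [f for i, f in enumerate(fields) if i not in SENSITIVE_COLUMNS]


def mapper(lines):
    """Emit 'domain<TAB>1' for every non-empty, scrubbed record."""
    for line in lines:
        if not line.strip():
            continue
        fields = scrub(line)
        yield f"{fields[-1]}\t1"


def reducer(lines):
    """Sum the counts per key; input must already be sorted by key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(int(value) for _, value in group)}"
```

In a real streaming job each function would sit behind a thin script that reads `sys.stdin` and prints the results. Locally, sorting the mapper output before passing it to the reducer stands in for Hadoop's shuffle step.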
The Company’s Development Approach
- Agile and Scrum with 3-week Sprints, with emphasis on high-priority stories (typically 80+% acceptance of stories)
- Rally project management
- No special support bucket
- Scrum and tickets for DevOps
The Software Development Flow
- Git for source control; repos are hosted internally
- Gerrit for code reviews and merges (developer pushes changes in their topic branch up for review)
- Jenkins for deployment and unit tests (kicks off master merge to Gerrit)
- pip install for Python packages from an internal PyPI server; everything runs in virtualenv
The Software Development Teams
- Data Technology – Data Delivery (Pipeline)
  - Distributed processes, batch processing, Map/Reduce jobs, application of Data Science models for inference and weights generation
  - Cloud Computing with Hadoop, Torque, MySQL, Vertica
  - Python, bash, C++
- Data Technology – Data Mining tools
  - Development of the tools for Data Miners, mostly CLI
  - Data Extraction using Hadoop, Torque, MySQL, Vertica
  - Stronger emphasis on efficiencies and response times
  - Data Aggregation and metrics generation, application of the weights
  - Python, bash, Java
- General Interfaces Team
  - Interface layers on top of Hadoop, Torque, Vertica, Luigi
  - Authentication and web services
  - Application deployment
  - Python, Java, network/system layers
- Syndicated Products Team
  - Python Django, Backbone, jQuery and various JavaScript charting libraries
  - CSS3, HTML5, LESS
- Data Collection Agent Team
  - Browser extensions for IE, Chrome and Firefox, proxy for Windows
  - JavaScript, C++, Windows Interfaces, Install, Python for report generation and the software download website
- DevOps
  - Production runs of pipeline, backfill, re-runs
  - Pro data publish
  - Hardware troubleshooting
  - Web server tuning, Linux in production environment, Windows Servers, IIS, Apache, NFS, SQL, SSH/SCP/SFTP
  - IT Labs: PC/Laptop support, HelpDesk Change MGMT
Future Tech Initiatives
- Dynamic Normalization Methodology, Mobile
- Complete overhaul of Pipeline
- Complete overhaul of Data Mining tools
- Redesign of Custom Products: Moving to Dynamic Normalization, simplifying User Experience
- Segment Management System: Allow Data Miners to generate, maintain and query a set of standard data segments with automated daily data pulls
- Next Generation Data Collection
- Local Proxy Windows Service
- Keeping up with Browser evolutions
As promised, here’s my email (just click it to ping me): levy.steve@gmail.com
Thanks. ~Steve