FREE Webinar, April 8th, 12 PM ET: Archival Storage: The OU & Regional Research Store (OURRstore)

FREE WEBINAR – Wed Apr 8 noonET/11amCT/10amMT/9amPT/8amAT/6amHT

2019-12-13 /projects2 filesystems maintenance on January 14


/projects2 on the Pegasus cluster will be offline from 7:00 AM to 11:59 PM EST on Tuesday, 01/14/2020. During this period, we will be working on the future expansion of our system. The rest of the system will function as usual. Please plan accordingly.

2019-12-09 ACS Documentation updates

Advanced Computing Systems Documentation & User Guides are now hosted on ReadTheDocs.

2019-02-05 CG Campus network upgrades complete

CCS network upgrades have been completed and access has been restored.


2017-02-17 CCS Advanced Computing – Slack community

CCS Advanced Computing invites you to connect with the Advanced Computing community on Slack.
The Advanced Computing community Slack channels provide a place for user discussions, information sharing, and informal announcements about CCS resources and developments. All users with an eligible UM email address can create a Slack account in UM Advanced Computing.

Big Data / Analytics

The latest addition to the Advanced Computing Services platform is the Big Data/Analytics core. The core focuses on providing not just the infrastructure for Big Data programs but also the expertise in Analytics and Machine Learning needed to address real-world problems. The Big Data Analytics Cluster (BDAC) service has over 256 cores and close to 75 TB of big data storage. Several ETL platforms (including Kettle) are used to stage data from the W.A.D.E. storage cloud onto BDAC.

BDAC is designed around the HDFS and MapReduce frameworks, which are exposed through industry-standard tools such as the NoSQL database HBase, Sqoop, Flume, and Spark, and it is heavily invested in Python and R bindings. PySpark, SparkR, and other machine-learning-focused engines are incorporated into BDAC, giving UM researchers the most flexible Big Data system possible.
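As a rough illustration of the MapReduce model that BDAC is built around, the sketch below counts words in plain Python. It is not BDAC code: the function names and sample records are hypothetical, and everything runs in a single process, whereas a real job would be distributed across the cluster by Hadoop or Spark with its data stored in HDFS.

```python
from collections import defaultdict


def map_phase(records):
    """Map step: emit a (word, 1) pair for every word in every record."""
    for record in records:
        for word in record.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by key.

    A real framework performs this grouping across nodes; here it is
    an in-memory dictionary.
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: combine the values for each key (here, by summing)."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(records):
    """Run map, shuffle, and reduce in sequence on an iterable of strings."""
    return reduce_phase(shuffle_phase(map_phase(records)))


# Hypothetical sample input, standing in for lines read from storage.
sample = ["big data storage", "big data analytics"]
counts = word_count(sample)
```

The same three-stage shape carries over to PySpark, where the map and reduce steps are expressed as transformations on a distributed dataset and the grouping is handled by the framework.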

Along with infrastructure, the Big Data/Analytics core also features subject matter expertise in Analytics for all data sets. Focused on non-traditional analysis techniques and Machine Learning, core staff are currently working on projects ranging from sentiment analysis with genetic algorithms to medical billing informatics and the URIDE data analytics platform.

The Big Data/Analytics core offers a range of specialized consulting services to UM’s research community in large-scale data collection and storage, data processing pipeline development, data mining and machine learning, big data search, and presentation layer development. We aim to improve and optimize business processes through data-driven decision making.

Big Data/Analytics core projects:

  • HPC cluster status and job statistics data collection and processing for performance, utilization, and efficiency research (dataset of more than 10 million records)
  • Hadoop cluster creation and management automation in OpenStack Cloud, and high-availability development
  • Pentaho Kettle Hadoop cluster integration
  • collaboration with the Business School on a text mining analysis of Amazon product reviews to build a model that could improve future sales
  • UHealth diagnosis and patient demographic data analysis to discover patterns among different elements of their clinic datasets (dataset of more than 5 million records)
  • analysis of hospital EDI dataset to improve the accuracy of insurance claims