2019-09-06 CCS services restored

As of 12:00 P.M., all CCS storage and systems are operational again. Thank you for your patience.

2019-04-10 Projects filesystem expansion complete

The expansion and maintenance of the /projects2 filesystem has been completed, and all access has been restored.

CCS AC
hpc@ccs.miami.edu

2019-02-05 CG Campus network upgrades complete

CCS network upgrades have been completed and access has been restored.

CCS AC
hpc@ccs.miami.edu

2018-10-13 Network maintenance complete

CCS network maintenance has been completed and access has been restored.

CCS AC
hpc@ccs.miami.edu

2018-09-21 File system access restored

/scratch and /projects file system access has been restored.

CCS AC
hpc@ccs.miami.edu

Big Data / Analytics

The latest addition to the Advanced Computing platform is the Big Data/Analytics core, which provides not only the infrastructure for Big Data programs but also expertise in Analytics and Machine Learning to address real-world problems. The Big Data Analytics Cluster (BDAC) service has over 256 cores and close to 75 TB of big data storage. Several ETL platforms (including Kettle) are used to stage data from the W.A.D.E. storage cloud onto BDAC.

BDAC is designed around the HDFS and MapReduce frameworks, exposed through industry-standard tools such as the NoSQL database HBase, Sqoop, Flume, and Spark, and relies heavily on Python and R bindings. PySpark, SparkR, and other machine-learning-focused engines are incorporated into BDAC, giving UM researchers the most flexible Big Data system possible.
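For readers new to the MapReduce model mentioned above, the following is a minimal sketch of its three phases (map, shuffle, reduce) in plain Python. It is an illustration only and does not use BDAC, Hadoop, or Spark; the function names and the sample input are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group values by key, as a framework does between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Combine the grouped values for each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in sequence on an in-memory input."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    # Hypothetical sample input; a real job would read from HDFS instead.
    sample = ["big data on BDAC", "big storage for big data"]
    print(word_count(sample))
```

On a cluster, the framework distributes the map and reduce steps across nodes and performs the shuffle over the network; the sketch keeps everything in one process so the data flow is easy to follow.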

Along with infrastructure, the Big Data/Analytics core also offers subject-matter expertise in Analytics for all data sets. Focused on non-traditional analysis techniques and Machine Learning, core staff are currently working on projects ranging from sentiment analysis with genetic algorithms to medical billing informatics and the URIDE data analytics platform.

The Big Data/Analytics core offers a range of specialized consulting services to UM’s research community in large-scale data collection and storage, data processing pipeline development, data mining and machine learning, big data search, and presentation layer development. We aim to improve and optimize business processes through data-driven decision making.

Big Data/Analytics core projects:

  • HPC cluster status and job statistics data collection and processing for performance, utilization, and efficiency research (dataset of more than 10 million records)
  • Hadoop cluster creation and management automation in OpenStack Cloud, and high-availability development
  • Pentaho Kettle integration with the Hadoop cluster
  • collaboration with the Business School on a text-mining analysis of Amazon product reviews to build a model that could improve future sales
  • UHealth diagnosis and patient demographic data analysis to discover patterns among different elements of their clinic datasets (dataset of more than 5 million records)
  • analysis of hospital EDI dataset to improve the accuracy of insurance claims
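
As an illustration of the kind of processing the HPC job statistics project involves, the sketch below computes a common CPU efficiency measure (CPU time used divided by core-seconds allocated) for individual jobs and for a set of jobs. The record fields and function names are assumptions made for this example; they do not describe the actual dataset schema or the core's methods.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class JobRecord:
    """Hypothetical job record; field names are assumed for illustration."""
    job_id: str
    cores: int            # cores allocated to the job
    wall_seconds: float   # elapsed wall-clock time
    cpu_seconds: float    # total CPU time consumed across all cores


def cpu_efficiency(job: JobRecord) -> float:
    """Fraction of the allocated core-seconds the job actually used."""
    allocated = job.cores * job.wall_seconds
    if allocated <= 0:
        return 0.0
    return job.cpu_seconds / allocated


def overall_efficiency(jobs: Iterable[JobRecord]) -> float:
    """Efficiency over many jobs, weighted by allocated core-seconds."""
    jobs = list(jobs)
    allocated = sum(j.cores * j.wall_seconds for j in jobs)
    if allocated <= 0:
        return 0.0
    return sum(j.cpu_seconds for j in jobs) / allocated


if __name__ == "__main__":
    # Made-up sample records, not real cluster data.
    sample = [
        JobRecord("a", cores=4, wall_seconds=100.0, cpu_seconds=200.0),
        JobRecord("b", cores=1, wall_seconds=100.0, cpu_seconds=100.0),
    ]
    for job in sample:
        print(job.job_id, cpu_efficiency(job))
    print("overall", overall_efficiency(sample))
```

Weighting by allocated core-seconds keeps a short single-core job from counting as much as a long multi-core one when summarizing cluster-wide efficiency.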