Big Data and Analytics

DSR Big Data is a division of DSR Corporation that specializes in helping companies store, process, and analyze large amounts of data.

Our team can help at every stage of your big data project, from requirements analysis and architecture design through development and maintenance. Our experience lets us choose the best combination of technology platforms, so the final solution is reliable, efficient, and not tied to a particular vendor, which helps you make the most of your budget and get the best result.

Frameworks

Hadoop, Hadoop YARN
Spark, Spark MLlib, GraphX, Spark SQL
Hive
Kafka
Apache Storm
scikit-learn
Weka
MATLAB
RapidMiner

NoSQL DBs and Languages

HBase
Cassandra
MongoDB
Scala
Java
Python
R programming language
C++

Models

Linear regression
Logistic regression
Support Vector Machines
Random forest
SVD
Neural Networks

Example DSR Analytics Projects

Development of a large BI analytical system on top of the company’s distributed 20-terabyte data storage. The distinctive challenge of the project was that DSR first had to migrate the data from an SQL database to a NoSQL database (to satisfy system performance requirements) and set up a continuous replication process. Platforms used: Cassandra, Spark, Spark SQL, Tableau.

DSR developed a solution for Western European banking companies that gave their users quick access to a large amount of information on potential customers, gathered from free Internet sources. The solution had two parts: a system that could parse more than 4 terabytes of information from different sources and continuously update it, and a way for users to access that information within fractions of a second. Platforms used: Cassandra, HBase, Spark, Weka, Hadoop, Hadoop YARN.

DSR took part in creating a fraud scoring system for a large microfinance company. The task was not only to combine data from more than 30 sources and calculate a score using a complex algorithm, but also to make that process reliable and fast, so that decisions can be made on the fly without any data loss. Platforms used: Apache Storm, Cassandra, HBase, Spark, Spark MLlib, GraphX.

Interested?

Get a Quote