AWS IoT is a managed cloud platform that lets connected devices easily and securely interact with cloud applications and other devices. AWS IoT can support billions of devices and trillions of messages, and can process and route those messages to AWS endpoints and to other devices reliably and securely.
Topics to be covered:
- Introduction to IoT
- Implementation using the MQTT Protocol
- Different Cloud Platforms: AWS IoT, Azure IoT
- Smart IoT-Enabled Manufacturing Industry
22 Dec 2018 | Fee: Free
Time: 4.00 PM (onwards)
Contact us: 0120-4646464 | 9873032127
Despite a surge of competition, SAS remains one of the most trusted and widely used programming languages for advanced analytics and data science. It is worth noting that SAS has been a leading language in this market for about two decades.
Date: 22nd December 2018 (3.00 PM onwards)
We look forward to your participation!
Mr. Varun has 8+ years of experience in Statistical Modeling, Machine Learning, Visualization, and large-scale data handling using SAS, R, Python, Big Data, Hadoop, and Tableau, and he currently works with PwC, one of the Big 4.
Macros in SAS are a powerful way to automate the tasks that you need to perform every day. They make your tasks easier by allowing you to reuse your code multiple times after defining it once. Macros allow you to define dynamic variables in the code that can take different values for different run instances of the same code.
For example, suppose you need to run a sales report every day and display the current date in its title. Without a macro, you would type the same commands each day and manually update the date in the title every time you run the report, as shown in the code extract below:
|proc print data=Sales.Itemsold;
|title "Item Sales on Friday, 2Feb18";
|run;
You can automate the same code by using dynamic variables for the current date, and rerun it each day without changing a single line, as shown in the code extract below:
|proc print data=Sales.Itemsold;
|title "Item Sales on &SYSDAY, &SYSDATE";
|run;
Global and Local Macro Variables
In the code above, SYSDATE and SYSDAY are automatic (global) macro variables that resolve to the system date and system day. Note that macro variables resolve only inside double quotation marks, not single ones. Macros can use both global and local variables: global macro variables can be accessed by multiple SAS programs, while local macro variables are accessible only within the macro in which they are defined.
A macro variable is created with the %LET statement followed by a variable name and an assigned value (a %LET inside a macro definition creates a local variable; in open code it creates a global one). You can use the following syntax:
|%LET <variable_name> = value;
A macro program is a set of SAS statements that you run together under a single name. You start a macro program with the %MACRO statement and end it with the %MEND statement; you then invoke it by name, prefixed with a percent sign.
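Putting these pieces together, a minimal sketch of a macro program might look like the following (the macro name daily_sales, the dept variable, and the Sales.Itemsold data set are illustrative, not from any real project):

|%LET dept = Stationery;
|
|%MACRO daily_sales;
|   PROC PRINT DATA=Sales.Itemsold;
|      TITLE "&dept Item Sales on &SYSDAY, &SYSDATE";
|   RUN;
|%MEND daily_sales;
|
|%daily_sales

Each run of %daily_sales prints the report with the current day and date resolved automatically in the title, so the code never needs to be edited.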
Big Data has become the new buzzword. Businesses actively use real-time data from millions of consumers to gain actionable insights into customer behavior and provide personalized services. Big Data helps businesses leverage their data to enhance overall efficiency, which in turn improves customer engagement and retention.
Big Data systems process huge volumes of data that cannot be managed with traditional database technologies. Because of its volume and variety, big data is stored in a distributed file system architecture, which scales easily and can store and process massive amounts of data effectively.
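To make the idea concrete, here is a deliberately simplified, single-machine Python sketch (not real HDFS code) of how a distributed file system splits a file into fixed-size blocks and copies each block to several nodes, so that losing one node loses no data:

```python
# Simplified sketch of distributed-file-system storage (illustrative only).
# Block size and node names are scaled-down assumptions, not real defaults.

BLOCK_SIZE = 8          # bytes per block (real systems use e.g. 128 MB)
REPLICATION = 3         # copies kept of every block
NODES = ["node1", "node2", "node3", "node4", "node5"]

def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Split raw file data into fixed-size blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_blocks(blocks, nodes=NODES, replication=REPLICATION):
    """Assign each block to `replication` distinct nodes, round-robin."""
    placement = {}
    for idx, _ in enumerate(blocks):
        placement[idx] = [nodes[(idx + r) % len(nodes)] for r in range(replication)]
    return placement

data = b"hello distributed world!"
blocks = split_into_blocks(data)        # 24 bytes -> 3 blocks of 8 bytes
placement = place_blocks(blocks)
print(placement[0])                     # nodes holding copies of block 0
```

If any single node in `placement[0]` goes offline, two other nodes still hold a copy of that block, which is the basic mechanism behind fault tolerance in distributed storage.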
Hadoop and Big Data
Apache Hadoop holds a primary position in the Big Data universe and is considered one of the best frameworks for implementing Big Data because it is a fundamental, general-purpose platform. It is open source and allows you to quickly process and store huge amounts of data.
It possesses the capabilities of an operating system, a data platform, and an application platform. It has a file system of its own and the fundamental capability to run applications and control the resources they use. In addition, it uses YARN (Yet Another Resource Negotiator), a cluster management technology.
Some of the other characteristics that make Hadoop a leading framework for Big Data implementation are:
- Scalability: Hadoop is highly scalable and can handle growing processing requirements, especially data coming from social media and next-generation devices connected to the IoT (Internet of Things), which generate vast amounts of data. Hadoop does not need high-end servers with large memory and high processing power; it runs on commodity hardware, which is affordable and easy to obtain.
- Parallel Processing: Parallel processing through MapReduce makes Hadoop a very powerful and fast computing platform. This technique distributes processing across multiple nodes and processes data where it is stored instead of transporting it across the network.
- Fault Tolerance: Hadoop is highly fault tolerant. It uses HDFS (Hadoop Distributed File System), a distributed file system that by default keeps 3 copies of each data block on separate nodes within a cluster. Whenever a node goes offline, HDFS serves the request from another replica, avoiding any kind of disruption.
- Flexible: Hadoop is very flexible and allows you to capture and store many different data types, including documents, images, and videos, and makes them readily available for processing and use. This is a big advantage to businesses because it lets them draw valuable insights from diverse data sources such as social media, email conversations, or clickstream data.
- Lower Administration: Hadoop does not have high administration and monitoring needs. It runs resiliently and does not pose scaling issues when demand grows.
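The MapReduce model mentioned above can be illustrated with a simplified, single-machine Python sketch of the classic word-count pattern (real Hadoop jobs are written against the MapReduce API, usually in Java, with the phases distributed across cluster nodes):

```python
from collections import defaultdict

# Simplified single-machine sketch of the MapReduce word-count pattern.

def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield (word, 1)

def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key (word)."""
    groups = defaultdict(list)
    for word, count in pairs:
        groups[word].append(count)
    return groups

def reduce_phase(groups):
    """Reduce: sum the counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["big data big insights", "big wins"]
counts = reduce_phase(shuffle_phase(map_phase(lines)))
print(counts["big"])   # 3
```

In a real cluster, the map and reduce functions run in parallel on many nodes, each working on the portion of data stored locally, which is exactly the "process data where it is stored" idea described above.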
Base SAS software, as the name suggests, is the base of all SAS software, because other SAS products build on it and apply the same basic rules. Base SAS is an integrated system of software solutions that not only has an extensible 4GL programming language of its own but also allows you to access, manage, and transform data.
It also allows you to analyze data and generate reports with graphics. You can develop applications, manage projects, perform statistical and mathematical analysis, and carry out business forecasting and decision support.
The three main components of Base SAS are:
- Data Management Utility
- Programming Language
- Data Analysis and Reporting
SAS organizes data into SAS Data Sets, which are quite similar to DBMS tables. Each row in a Data Set is a separate entity referred to as an observation, each column is referred to as a variable, and each separate piece of information in a Data Set is called a data value. You use the SAS programming language to build Data Sets and generate reports.
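As a small illustration, a Data Set could be created with a DATA step like this (the library, variable names, and values below are hypothetical):

|DATA work.itemsold;
|   INPUT Item $ Quantity Price;
|   DATALINES;
|Pen 5 10
|Book 2 150
|;
|RUN;

Here each line of raw data (e.g. Pen 5 10) becomes one observation; Item, Quantity, and Price are the variables; and each individual entry, such as 150, is a data value.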
Benefits of Using Base SAS
Base SAS offers the following benefits:
- Data Integration: Base SAS has an open, cross-platform architecture that allows you to perform data integration across different computing environments and infrastructures. This lets developers read, format, analyze, and report on data quickly, without worrying about its format, and focus their efforts on getting a single view of the data.
- Flexible and Powerful: Base SAS is flexible and powerful because it is scalable and supports parallel processing and multi-threading. Its high-performance capabilities, and its ability to execute data manipulation routines in-database or in-Hadoop rather than moving the data, produce faster results and improve performance and security.
- Easy to Learn: Base SAS is easy to learn and intuitive. It uses procedures that encapsulate functionality and can be called with simple commands. SAS Studio allows you to access existing programs, libraries, and data files from any mobile device with a web browser. You can generate reports using statistical procedures and view them in many standard formats such as RTF, PDF, PowerPoint, HTML, and e-book formats, including on platforms such as iPad, iPhone, and iBooks.
In today's era of data explosion, both big and small companies must make sense of massive, unprecedented datasets to survive the competition. Not only data analysis but also data visualization has become essential for companies to extract meaningful results from virtually unlimited data. Big Data has emerged as a game changer, replacing traditional data processing approaches that go no further than Excel charts, graphs, and pivot tables.
Big Data enables real-time personalization by making it possible to track the behavior of individual customers from the clicks they make on the Internet. Based on their browsing behavior, businesses can offer consumers the products and services they need.
The two most common technologies that come to mind when you think about Big Data are Apache Hadoop and Apache Spark. Although both are used for distributed storage and for processing big data sets, neither is a replacement for the other.
Hadoop has emerged as a mature and cost-effective way to manage large data sets with its MapReduce programming model. MapReduce is best suited for batch processing of static data, which it does very effectively; however, it is not suitable for iterative workloads. Hadoop was initially used for log processing and analysis. It uses HDFS and works with datasets loaded from disk, which slows its performance because of continuous disk I/O operations, data serialization, and data replication in HDFS.
Apache Spark can hold large datasets in memory and process terabytes of real-time streaming data across a number of machines. Spark is easier to learn and use than Hadoop. For in-memory workloads it can be up to 100 times faster than Hadoop MapReduce; rather than the MapReduce execution engine, it uses its own distributed runtime for generalized computation. It provides an interactive shell, and its API supports multiple languages.
Spark is best suited for applications that run on an iterative model and perform continuous read and write operations. It is best used in situations where the data is a few hundred gigabytes and can fit in memory.
However, a major disadvantage of Spark is that it has no storage layer of its own and typically relies on Hadoop HDFS. In addition, it is less fault tolerant than Hadoop. So if you need high fault tolerance, then you should use Hadoop instead of Spark.
Hadoop and Spark Collaboration
Hadoop and Spark can be used in collaboration in scenarios where you must perform many transformations on huge data sets, need intermediate storage during processing, and also need to persist data on disk. In such a scenario, you can use Spark for intermediate in-memory processing and Hadoop to store data on disk. Using either technology alone can give you a tough time, because Spark alone is not fit to store production workloads and Hadoop alone cannot provide fast execution times. Using both together gives you a smooth data flow and faster data processing.