Posts

Showing posts from June, 2019

20 Types of Visualisation

Being able to visualise data is very important in making the information held within it easily digestible. Data that is represented visually is much easier for us to understand, so we turn to visualisation solutions to do this for us. One example of a Big Data visualisation solution is Jupyter, an open-source project which offers analysis and visualisation. It accepts code written in a selection of programming languages and uses it to produce a chosen method of visualisation. There is also functionality to share and collaborate on these visualisations. Another solution is Tableau, which leans into visualising Big Data in the context of machine learning analysis and what this produces, offering a range of integrations with various Big Data cloud services. You can read more about visualisation solutions here: https://towardsdatascience.com/top-4-popular-big-data-visualization-tools-4ee945fe207d

19 Data Mining Methods

Data Mining is the practice of deriving information from large amounts of existing data. This can be done in a variety of ways. One of these ways is by creating associations between sets of data. This is a common practice in retail, as drawing connections between customer purchases allows businesses to use these behaviours to encourage more purchases by placing associated items near one another, or offering deals. Classification is another Data Mining practice, wherein common factors are identified in order to group entities together, so that a prediction made about one entity can be applied across the broader category. This was touched upon briefly in an earlier post, in reference to how recruitment can be aided by identifying common factors among successful applicants. For more information you can read here: https://datafloq.com/read/5-major-data-mining-techniques-being-used-big-data/3352
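
To make the idea of associations a little more concrete, here is a minimal sketch in Python. The shopping baskets are invented for illustration, and `pair_rules` is a hypothetical helper rather than part of any library. It counts how often two items appear together (support) and how often the second item appears given the first (confidence), which are two common measures used when mining associations.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, invented for illustration only.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "cereal"},
    {"bread", "butter", "jam"},
]


def pair_rules(baskets):
    """Return {(a, b): (support, confidence)} for every rule 'a -> b'."""
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        item_counts.update(basket)
        # Sort so that each pair is always counted under the same key.
        pair_counts.update(combinations(sorted(basket), 2))

    total = len(baskets)
    rules = {}
    for (a, b), together in pair_counts.items():
        support = together / total
        rules[(a, b)] = (support, together / item_counts[a])
        rules[(b, a)] = (support, together / item_counts[b])
    return rules


rules = pair_rules(baskets)
support, confidence = rules[("butter", "bread")]
print(f"butter -> bread: support={support:.2f}, confidence={confidence:.2f}")
```

In this made-up data every basket containing butter also contains bread, which is the kind of pattern a retailer might act on by placing the two items near one another.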

18 Types of Problem Suited to Big Data Analysis

There are a lot of problems where Big Data Analysis can provide a solution. One such area is logistics. Mechanical failures of delivery vehicles represent a problem, as it costs money to repair the vehicles, and this wastes the time of both the company and the customers who receive their items late. Big Data Analysis allows logistics companies to take preventative measures by monitoring all of their vehicles and identifying when their components are likely to fail. By doing so they are able to ensure that components are replaced at the optimum time for maintaining time and cost efficiency. For other examples of problems Big Data can solve, read here: https://www.forbes.com/sites/gregsatell/2013/12/03/yes-big-data-can-solve-real-world-problems/#416d707a8896

17 Strategies for Limiting the Negative Effects of Big Data

As mentioned earlier when discussing Big Data's implications for individuals, there is a high potential risk to personal privacy associated with the increase in personal data being stored about us everywhere we go. This cumulative data profile can hold a dangerous amount of personal information, and as such it is important to demonstrate responsibility with the data that is handled. To protect the individual from malicious exploitation of their data, it is common practice to anonymise stored personal data, so that it can be used for things like pattern recognition and inferential analytics without exposing the individual to undue risk of their data being compromised. However, this is often insufficient, as individuals can still be identified by correlating multiple data sources. You can read more here: https://www.ft.com/content/105e30a4-2549-11e3-b349-00144feab7de
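
The way anonymised data can be linked back to individuals can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the records are invented, and the field names and the `reidentify` function are hypothetical. The idea is that removing names is not enough if other fields (such as postcode, birth year and gender) are shared with a second data set that does include names.

```python
# Hypothetical records, invented for illustration only.
# Names have been removed, but other identifying fields remain.
anonymised_purchases = [
    {"postcode": "AB1 2CD", "birth_year": 1984, "gender": "F", "purchase": "prescription"},
    {"postcode": "AB1 2CD", "birth_year": 1990, "gender": "M", "purchase": "books"},
    {"postcode": "EF3 4GH", "birth_year": 1984, "gender": "F", "purchase": "groceries"},
]

# A second, separate data set that does include names.
public_register = [
    {"name": "Alice Example", "postcode": "AB1 2CD", "birth_year": 1984, "gender": "F"},
    {"name": "Bob Example", "postcode": "AB1 2CD", "birth_year": 1990, "gender": "M"},
]

# Fields that appear in both data sets and can be used to link them.
LINKING_FIELDS = ("postcode", "birth_year", "gender")


def reidentify(anonymised, register):
    """Return (name, purchase) pairs where the linking fields match one person only."""
    names_by_key = {}
    for person in register:
        key = tuple(person[field] for field in LINKING_FIELDS)
        names_by_key.setdefault(key, []).append(person["name"])

    matches = []
    for record in anonymised:
        key = tuple(record[field] for field in LINKING_FIELDS)
        names = names_by_key.get(key, [])
        if len(names) == 1:  # a unique match identifies the individual
            matches.append((names[0], record["purchase"]))
    return matches


for name, purchase in reidentify(anonymised_purchases, public_register):
    print(f"{name} bought: {purchase}")
```

Two of the three "anonymous" records are matched to a name here, even though neither data set is revealing on its own.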

16 Implications of Big Data for Society

As mass adoption of Big Data continues, the potential societal implications are massive, and may extend far beyond the understanding of the average individual. Big Data has the potential to change the way our world functions, not least in our economy. Across the world, in areas like health care, administration and IT, Big Data has the potential to increase value and spending in the hundreds of billions, as well as creating a significant number of jobs in analysis. You can read more here: https://marksbigdatablog.blogspot.com/2019/06/implications-of-big-data-for-society.html

15 Implications of Big Data for Individuals

As individuals, we of course benefit from the societal and scientific Big Data applications that were discussed earlier. On a solely individual basis, it is open to interpretation whether the tailored content that is presented to us in search engines and advertisements is a good thing or a bad thing. Some people certainly are not thrilled that their data is being used to frame the content that is displayed to them, or to sell them things. The fact that our data can be used in ways that we may not be comfortable with, without our express consent, is one major disadvantage of the onset of Big Data. It has significant implications for privacy and personal security, as our personal information is being collected, and if this information is compromised it could be a big problem.

14 Limitations of Predictive Analytics

One limitation of Predictive Analytics is the current inability to account for human decision making, even with a substantial data set to work with. In the 2016 Presidential Election in the United States, a variety of analytics sources placed a 70% probability on the outcome of the election being in favour of Hillary Clinton. While concluding that the 70% chance was incorrect is somewhat of a misunderstanding of probability, the consistency with which similar predictions were made indicates that there are factors that predictive analytics cannot yet comprehend, and as such it should not be relied upon solely in matters such as these. You can read more about this here: https://www.dataversity.net/limitations-predictive-analytics-lessons-data-scientists/#
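
The point about probability can be illustrated with a small simulation. This is only a sketch and not a model of any real election: the `upset_rate` function, the number of trials and the seed are all assumptions chosen for illustration. It simulates many independent contests in which the favourite really does have a 70% chance of winning, and counts how often the favourite loses anyway.

```python
import random


def upset_rate(favourite_probability, trials, seed=0):
    """Fraction of simulated contests in which the favourite loses."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    upsets = sum(rng.random() >= favourite_probability for _ in range(trials))
    return upsets / trials


rate = upset_rate(0.70, 10_000)
print(f"The favourite lost in {rate:.1%} of simulated contests")
```

Even when the 70% figure is exactly right, the favourite still loses roughly three times in ten, so a single surprising outcome does not by itself show that a forecast was wrong.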

13 Technological Requirements of Big Data

As we've said several times now, the sheer volume of data being produced every day is massive, and it never stops. As a result, storing all of our data is a massive task that may need specialised solutions, potentially including cloud backups to guard against losing the data. With storage demands being met, the next task is being able to process the data itself. For example, if a larger volume of data than usual is expected in particular time frames, it may be necessary to anticipate this and provide greater processing power to handle it. None of the above matters unless we know how to derive meaning from the data, which is where analysis comes into the picture. We've mentioned types of analysis before, and the reason this is important, but one analytical application could be to predict future trends based on the data set. As these massive data sets will likely be accessed from a variety of locations, the ability to transpo...

12 Future Applications of Big Data

Future applications of Big Data are likely to rely on finding ways to encourage users to provide data to us in a structure that is easy to parse. Things like star ratings are easy to apply metrics to in order to extract some sort of meaning from the data we are provided, and as such it's important that we find ways to frame our data collection in such a fashion. It is also likely that newer and more readily available technologies will present new avenues for data collection that are not currently open to us. It would be prudent to remain mindful of the implications of emergent technologies with respect to the opportunities they represent for data collection, and in the case of machine learning, the massive opportunity for automated analysis of this data.

11 Contemporary Applications of Big Data in Society

Big Data can be applied across a range of societal avenues in order to make our lives easier. One example of this is in allowing us to create more reliable and accurate weather predictions. The array of satellites and sensors that blanket the world is constantly producing a torrent of data, and this can be used to identify environmental patterns that lead to all types of weather conditions. This could be as simple as helping us to know when it's going to rain, or as major as being able to better anticipate truly hazardous weather conditions like hurricanes and other natural disasters. Another area in which Big Data can help society is transportation. It allows those involved in the industry to identify, in real time, factors that could lead to traffic congestion, and to use this to take preventative measures to avoid traffic jams and even reduce the risk of accidents. For more information you can read here: https...

10 Contemporary Applications of Big Data in Science

Big Data represents a fantastic opportunity in the field of science to extract meaning from the avalanche of data that we create. One example of a scientific application of Big Data is the opportunity to use it to fight disease. Current diagnostic efforts for people suffering from diseases like cancer produce close to a terabyte of data per person. All of this data may hold vital information that allows great strides to be taken towards better understanding these afflictions and identifying ways in which we can combat them. Needless to say, finding these crucial strands of information is of the utmost importance, and utilising Big Data to the fullest will be required in order to do so. For more information you can read here: https://www.canwelivebetter.bayer.com/innovation/finding-cure-cancer-big-data-solution

09 Contemporary Applications of Big Data in Business

Finding new ways to utilise Big Data in business is something companies are always looking to do. With a large amount of money being made online via advertising, companies are using Big Data to better target their ads to consumers and so increase the rate at which we engage with these ads. Similarly, online retailers may create a profile of their customers using Big Data in order to expose them to more products they are likely to buy. In workplace recruitment, someone's data profile can be compared to past hires in order to determine whether they share similarities with successful past recruits, or conversely whether they share characteristics with hires that have been unsuccessful. This would allow recruiters to filter prospective candidates to theoretically ensure a higher quality overall. For more information, read here: https://www.searchtechnologies.com/blog/big-data-use-cases-for-business

06 Traditional Statistics

Traditional Statistics relies on data stored in smaller scale, fixed structure formats, which are only able to provide a small insight into a given problem. Descriptive statistics summarise a data set so that we may identify patterns in it, but those patterns can only be said to apply to the data set provided, and cannot necessarily be used to draw broader conclusions on a subject. Inferential statistics, however, take a sample of a much larger data set and use the conclusions drawn from it to infer patterns we may apply to the broader data set. For more information about Descriptive and Inferential statistics, read here: https://statistics.laerd.com/statistical-guides/descriptive-inferential-statistics.php
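
The difference between the two can be sketched in Python using only the standard library. The data here is randomly generated and purely hypothetical (it stands in for something like heights in centimetres), and the 1.96 multiplier assumes an approximate 95% confidence interval based on the normal distribution. The descriptive part summarises the sample itself, while the inferential part uses the sample to estimate a range for the mean of the whole population.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population, generated only for illustration.
population = [rng.gauss(170, 10) for _ in range(100_000)]
sample = rng.sample(population, 200)

# Descriptive statistics: these describe the sample and nothing more.
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)
print(f"Sample mean: {sample_mean:.1f}, sample standard deviation: {sample_sd:.1f}")

# Inferential statistics: use the sample to estimate the population mean.
# Approximate 95% confidence interval, assuming a normal approximation.
margin = 1.96 * sample_sd / len(sample) ** 0.5
low, high = sample_mean - margin, sample_mean + margin
print(f"Estimated population mean: between {low:.1f} and {high:.1f}")
```

Only 200 of the 100,000 values are examined, yet the interval gives a reasoned estimate for the whole data set, which is the step descriptive statistics alone cannot take.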

08 Characteristics of Big Data Analysis

Analysing Big Data has become massively important in the modern business world, as companies have spent years accumulating data with no practical means of inferring information from it. Analysis of Big Data can fall into one of two categories: Decision-oriented and Action-oriented. Decision-oriented Analysis in general is when we are looking for an answer to a problem, and we look to infer a solution to this problem from our data by noticing patterns or trends to inform our decisions. Action-oriented Analysis lends itself more to monitoring data and watching for some specific circumstances that demand a response. For example, if there is a sudden upturn in purchases for a certain type of product, we are able to recognise this and react. For more information, read here: https://www.dummies.com/programming/big-data/data-science/characteristics-of-big-data-analysis/
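
As a rough illustration of Action-oriented Analysis, the sketch below watches a series of daily sales figures and flags any day that is far above the recent average. The sales numbers are invented, and the function name `find_upturns`, the seven day window and the "double the average" threshold are all assumptions chosen for the example rather than an established method.

```python
from statistics import mean


def find_upturns(daily_sales, window=7, factor=2.0):
    """Return the indices of days whose sales exceed `factor` times
    the average of the preceding `window` days."""
    alerts = []
    for day in range(window, len(daily_sales)):
        baseline = mean(daily_sales[day - window:day])
        if daily_sales[day] > factor * baseline:
            alerts.append(day)
    return alerts


# Hypothetical daily sales of one product, invented for illustration.
sales = [20, 22, 19, 21, 20, 23, 21, 22, 60, 24]
alerts = find_upturns(sales)
print(f"Days with a sudden upturn: {alerts}")
```

Here only the day with 60 sales is flagged, which is the point at which a business could recognise the change and react.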

07 Limitations of Traditional Data Analysis

The Traditional Data Analytics we discussed before have value in the context of providing some way for educated business analysts to interpret a large and unwieldy data set, but this has clear limits in the current data climate. Traditional database systems struggle greatly to handle the sheer size of data we are dealing with in the modern era, and how quickly and how often this data is generated. In addition to this, they also struggle with the variety of unstructured data, as the range of data bleeding out from communications and other forms of media is impossible for a relational database to categorise. You can read more about these limitations here: https://marketrealist.com/2014/07/traditional-database-systems-fail-support-big-data/

05 Value of Data

In the slides we were shown in class, it was pointed out that the reason Facebook is worth more than several lucrative companies such as McDonald's, despite only generating $1 billion worth of revenue, is the value its data stores hold, now and in the future. While so far we have focused on the sheer exponential volume of data that is becoming available in the modern world, the value of this data has not been as much of a focus. Making a conscious effort to utilise and profit from the data that companies generate is set to become a necessity, and it is predicted that all companies will be purchasing external data to enhance their business by next year. For more information, you can read here: https://datamakespossible.westerndigital.com/value-of-data/

04 Reasons for the Growth of Data

Naturally, the prominent reason for the growth of data has been the technology we employ in the modern world. Between phones, computers, tablets, game consoles and all other types of home media, our lives are entangled in devices that make note of every interaction we have with them.  Our online searches, browsing habits, communications, media preferences and so on all feed into a well of data that companies are making every effort to store and interpret. It is estimated that by the year 2025, 463 exabytes of data will be generated every day across the world. For more information about the daily generation of data, read here: https://www.visualcapitalist.com/how-much-data-is-generated-each-day/

03 Growth of Data

The amount of data being produced in the present day is growing at an alarming rate. In the slides we were shown in class, we were told that 90% of the data in the world has been produced in the last two years. This exponential growth shows no signs of stopping, with many sources on the subject agreeing that we can expect the amount of existing data to double as regularly as every two years. Right now the 'digital universe' contains 2.7 zettabytes of data, and the ways in which this is growing are too numerous to list. For one example, it has been estimated by the IDC that by next year, there will be 450 billion online transactions on a daily basis. For more information you can read: https://insidebigdata.com/2017/02/16/the-exponential-growth-of-data/ https://waterfordtechnologies.com/big-data-interesting-facts/
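
To get a feel for what these growth figures mean, here is a small worked calculation in Python. It assumes a very simple model in which the total amount of data doubles at a steady rate, and the function names are hypothetical helpers written for this post. Under that assumption we can work out what share of all data is recent for a given doubling time, what doubling time a given share implies, and how large the total would become.

```python
import math


def share_created_recently(doubling_years, window_years):
    """Share of all existing data created in the last `window_years`,
    assuming the total doubles every `doubling_years`."""
    return 1 - 2 ** (-window_years / doubling_years)


def doubling_time_for_share(share, window_years):
    """Doubling time (in years) implied if `share` of all data
    was created in the last `window_years`."""
    return window_years * math.log(2) / math.log(1 / (1 - share))


def projected_size(current_size, doubling_years, years_ahead):
    """Total size after `years_ahead` years of steady doubling."""
    return current_size * 2 ** (years_ahead / doubling_years)


print(f"Doubling every 2 years: {share_created_recently(2, 2):.0%} of data is under 2 years old")
print(f"90% in the last 2 years implies doubling every {doubling_time_for_share(0.9, 2):.2f} years")
print(f"2.7 zettabytes doubling every 2 years for 10 years: {projected_size(2.7, 2, 10):.1f} zettabytes")
```

Under this simple model the two figures quoted above describe quite different growth rates, so they are best read as rough estimates from different sources rather than as one consistent measurement.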

02 Historical Development of Big Data

In general, the volume of data available from many avenues, and by extension the struggle to interpret it, has been exponentially increasing for many years now. In the United States, when the Social Security Act came into effect in 1937, monitoring of contributions from millions of citizens and employers was a massive task that was entrusted to IBM. At the time, their solution to the demands of this task was to develop a machine that read information from punch cards. In the 82 years since this development, the growth in the volume of available data has accelerated to an incredible degree. Through warfare, security, finance and ultimately the onset of the internet, we have been striving to collect more and more data. In the present day, the volume of available data is more bloated than ever before, in particular as a result of social media. Now the task presented to organisations is to interpret and utilise the unprecedented amount of data available to them, and...