by Mark Kelly - VCE Applied Computing, VCE Data Analytics, VCE Software Development

Write to  Mark Kelly


Techniques for efficient and effective data collection

For VCE Data Analytics


Techniques for efficient and effective data collection, including methods to collect

  • census [data],
  • Geographic Information System (GIS) data,
  • sensor [data],
  • social media [data] and
  • weather [data]

[Note: I have edited the KK's punctuation to resolve an ambiguity. Let me know if you read it differently.]

slideshow General Data collection techniques
slideshow Geographic Information Systems (new for 202

Census data

I can't research this until an ambiguity in the study design has been cleared up.

When the KK refers to "Techniques for efficient and effective data collection, including methods to collect census [data]..." is it referring to techniques used by

  • the government Census Department to collect data (e.g. their paper-based and online census forms), or
  • techniques used by citizens wanting to collect data from the government's census data (e.g. an API, browsing

Sensor data

Sensor data is the output of a device that detects and responds to some type of input from the physical environment. The output may be used to provide information or input to another system or to guide a process.

Sensors can be used to detect just about any physical element. Here are a few examples of sensors, just to give an idea of the number and diversity of their applications:

  • An accelerometer detects changes in gravitational acceleration in the device it is installed in, such as a smartphone or a game controller, to determine acceleration, tilt and vibration.
  • A photosensor detects the presence of visible light, infrared transmission (IR) and/or ultraviolet (UV) energy. 
  • Lidar, a laser-based method of detection, range finding and mapping, typically uses a low-power, eye-safe pulsing laser working in conjunction with a camera.
  • A charge-coupled device (CCD) stores and displays the data for an image in such a way that each pixel is converted into an electrical charge, the intensity of which is related to a color in the color spectrum.
  • Smart grid sensors can provide real-time data about grid conditions, detecting outages, faults and load and triggering alarms.

Wireless sensor networks combine specialized transducers with a communications infrastructure for monitoring and recording conditions at diverse locations. Commonly monitored parameters include temperature, humidity, pressure, wind direction and speed, illumination intensity, vibration intensity, sound intensity, power-line voltage, chemical concentrations, pollutant levels and vital body functions.

Sensor data is an integral component of the growing Internet of Things (IoT) environment. In the IoT scenario, almost any entity imaginable can be outfitted with a unique identifier (UID) and the capacity to transfer data over a network. Much of the data transmitted is sensor data.

Top 10 IoT Sensor Types

  • Temperature Sensors. Temperature sensors measure the amount of heat energy in a source, allowing them to detect temperature changes and convert these changes to data. ...
  • Humidity Sensors. ...
  • Pressure Sensors. ...
  • Proximity Sensors. ...
  • Level Sensors. ...
  • Accelerometers. ...
  • Gyroscope. ...
  • Gas Sensors.

What is sensor data in data analytics?

A sensor data analytics platform is built to analyse the data streamed or collected from sensors and IoT devices. The data is analysed to give insight into the current status of a device using different metrics, which are set based on the organisation's needs.
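As a rough illustration of "metrics set based on the organisation's needs", the sketch below summarises a small batch of readings. The readings, the function name `summarise` and the threshold are all hypothetical values invented for this example; a real platform would choose its own metrics.

```python
from statistics import mean

# Hypothetical temperature readings (degrees Celsius) streamed from one device.
readings = [21.5, 21.7, 22.0, 24.9, 22.1]


def summarise(values, high_threshold):
    """Return simple status metrics for a batch of sensor readings."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": round(mean(values), 2),
        # Number of readings above a threshold the organisation has chosen.
        "above_threshold": sum(1 for v in values if v > high_threshold),
    }


summary = summarise(readings, high_threshold=24.0)
print(summary)
```

Changing `high_threshold` is one simple way an organisation could tune the same data to answer a different question.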

Sensor data collection

Often, sensors are used to capture information relevant to a particular task, so the data can be used to make process improvements that save money or increase efficiency. Sensors are connected through gateways, which enable them to relay the collected data to a server in the cloud.
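The gateway idea can be sketched in a few lines: readings arrive one at a time, are buffered, and are handed on as a single batch. This is only a minimal sketch under assumptions. The `Gateway` class, the `make_reading` helper and the field names are hypothetical, and the step of actually sending the payload to a cloud server is left out.

```python
import json
import time


def make_reading(sensor_id, kind, value, unit, timestamp=None):
    """Build one reading as a plain dictionary (field names are hypothetical)."""
    return {
        "sensor_id": sensor_id,
        "kind": kind,
        "value": value,
        "unit": unit,
        "timestamp": time.time() if timestamp is None else timestamp,
    }


class Gateway:
    """Buffers readings and hands them over in batches."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.buffer = []

    def receive(self, reading):
        """Store a reading; return a JSON payload when the batch is full, else None."""
        self.buffer.append(reading)
        if len(self.buffer) < self.batch_size:
            return None
        payload = json.dumps({"readings": self.buffer})
        self.buffer = []
        return payload


gateway = Gateway(batch_size=2)
first = gateway.receive(make_reading("t-01", "temperature", 21.5, "C", timestamp=0))
second = gateway.receive(make_reading("h-01", "humidity", 48.0, "%", timestamp=1))
print(first)   # None, because the batch is not yet full
print(second)  # JSON text containing both readings
```

Batching is a common design choice because sending many small messages usually costs more power and bandwidth than sending fewer, larger ones.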

A set of sensors (e.g. touchscreen, webcam, microphone and others) will be used to gather information about the way elders interact with the workspace. The aim is to infer information about cognitive abilities and engagement level, such as potential impairments that decrease the interaction quality, problems focusing on the screen, difficulties reading what is shown on the screen, and problems managing the keyboard or the items on the screen. Analyzing state-of-the-art approaches, we have identified potential age-related problems regarding vision, hearing, memory, attention, coordination and locomotion.

Regular monitoring of livestock managed in extensive grazing systems is essential for the animal's welfare and productivity. However, inspecting livestock routinely by direct observation or measurement is a costly and onerous task for farmers managing large herds across extensive agricultural landscapes [1]. It is widely accepted that relationships exist between grazing behaviour and feed supply. However, factors affecting this behaviour are still poorly understood, and relationships may be influenced by the characteristics of the paddock environment, flock structure and type of livestock. Livestock has been found to respond to decreased sward biomass by increasing grazing time, reducing time idling, increasing distance walked and lessening bites taken at each feeding station [2]. Sward structure also affects animal daily forage intake [3]. Thus, our understanding and use of this information are likely to benefit substantially from developments in sensor technologies and new analytical methods.

The use of sensors for monitoring livestock has opened up new possibilities for the management of livestock in extensive grazing systems. The work presented in this paper aimed to develop a model for predicting the metabolisable energy intake (MEI) of sheep by using temperature, pitch angle, roll angle, distance, speed, and grazing time data obtained directly from wearable sensors on the sheep.

Automated collection of human behavior data is one of the recent developments in the data collection field. Companies can analyze the behaviors of their customers and get insight into their needs by using automated collection technology. In this study, we analyze location-based services data collected from a major shopping mall in İstanbul.

Traditional field-based data acquisition of avalanche activity has limitations.

  • We review advances, potential, and limitations of remote sensing of avalanches.
  • We review terrestrial, airborne, and spaceborne radar, optical, and LiDAR sensors.

There is considerable interest in detecting people crossing the border with fewer false alarms and high confidence. This capability requires understanding the phenomenology of various sensor modalities and developing algorithms based on the phenomenology. In an effort to develop this capability, U.S. Army Research Laboratory scientists went to the southwest border to collect data using acoustic, seismic, passive infrared (IR), profiling, electric field, magnetic field, radar, sonar, visible, and IR imaging sensors. In this report, we discuss the data collection effort and resultant data, phenomenology of various sensor modalities, and robust detection algorithms. In the future, the acoustic sensor data will be processed to determine the characteristic features of human voice formants, etc.; the seismic data will be processed using the ground transfer function to determine the cadence of a person walking as opposed to an animal; and the radar and ultrasonic data will be processed to determine the Doppler frequency resulting from various limb movements.

Sudden infant death syndrome (SIDS) is the expert diagnosis given when an apparently healthy baby dies without explanation. When physicians or coroners cannot explain the cause of death, it is classified as sudden death. This paper reviews the related literature and proposes a mobile solution based on biofeedback monitoring that tries to prevent sudden death in infants. The system uses real-time data collection from sensors to detect a baby's health problems in advance and warn those who take care of the baby. When an issue is detected by this system (i.e., the sensors send abnormal data), it sends a warning to those responsible for the baby. It allows access to data from sensors (such as the baby's position and the crib) and their analysis in real time. Signal processing algorithms are used in real time to prevent a sudden death. Mobile devices (such as smartphones or tablets) are used to process the sensed data and monitor the baby, raising alerts or warnings when an abnormal situation is detected.
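The general pattern of "send a warning when the sensors send abnormal data" can be sketched as a simple range check. This is not the method used in the paper above, which relies on signal processing algorithms; the function name `find_alerts`, the readings and the limits are hypothetical and carry no real-world meaning.

```python
def find_alerts(readings, low, high):
    """Return (index, value) pairs for readings outside the range low..high."""
    return [(i, v) for i, v in enumerate(readings) if v < low or v > high]


# Hypothetical readings and limits, chosen only to show the check working.
readings = [50, 52, 71, 49, 30]
alerts = find_alerts(readings, low=40, high=60)

for index, value in alerts:
    print(f"Warning: reading {index} is outside the expected range: {value}")
```

In practice the hard part is choosing limits that catch real problems without raising too many false alarms.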

This paper presents the development of a wireless sensor monitoring system for environmental applications. The system is based on wireless ZigBee technology and uses a 32-Bit Arduino Uno microcontroller for online monitoring of environmental parameters such as carbon dioxide, oxygen, temperature and humidity levels. The CO2, O2, humidity, and temperature sensors are integrated into the data acquisition system. Data was collected at a certain location at Universiti Sains Malaysia, and consistency models are defined for analyzing the quality of the data and the level of carbon dioxide and oxygen in the deployed environment. The results show that the system is capable of monitoring and analysing CO2 and O2 in the deployed environment, and this success shows the potential of the system for application in environments where reliable gas monitoring is crucial.


Social media data

This paper presents the first criminological analysis of an online social reaction to a crime event of national significance, in particular the detection and propagation of cyberhate on social media following a terrorist attack. We take the Woolwich, London terrorist attack in 2013 as our event of interest and draw on Cohen's process of warning, impact, inventory and reaction to delineate a sequence of incidents that come to constitute a series of deviant responses following the attack. This paper adds to contemporary debates in criminology and the study of hate crime in three ways: (1) it provides the first analysis of the escalation, duration, diffusion and de-escalation of cyberhate in social media following a terrorist event; (2) it applies Cohen's work on action, reaction and amplification and the role of the traditional media to the online context and (3) it introduces and provides a case study in 'computational criminology'.

This article discusses how social media research may benefit from social media companies making data available to researchers through their application programming interfaces (APIs). An API is a back-end interface through which third-party developers may connect new add-ons to an existing service. The API is also an interface for researchers to collect data off a given social media service for empirical analysis. Presenting a critical methodological discussion of the opportunities and challenges associated with quantitative and qualitative social media research based on APIs, this article highlights a number of general methodological issues to be dealt with when collecting and assessing data through APIs. The article further discusses the legal and ethical implications of empirical research using APIs for data collection.
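Many APIs return results a page at a time, so a collector has to keep asking for the next page until none is left. The sketch below shows that loop in isolation. It does not call any real social media service: `collect_all`, `fake_fetch` and the sample posts are hypothetical stand-ins, and a real API would also require authentication and would impose its own rate limits and terms of use.

```python
def collect_all(fetch_page, max_pages=10):
    """Collect items from a paginated source.

    fetch_page(cursor) must return (items, next_cursor); next_cursor is None
    when there are no more pages. max_pages caps the number of requests.
    """
    items, cursor = [], None
    for _ in range(max_pages):
        page_items, cursor = fetch_page(cursor)
        items.extend(page_items)
        if cursor is None:
            break
    return items


# Stand-in for a real API call: hypothetical posts split over three pages.
FAKE_PAGES = {
    None: (["post 1", "post 2"], "page2"),
    "page2": (["post 3", "post 4"], "page3"),
    "page3": (["post 5"], None),
}


def fake_fetch(cursor):
    """Return the hypothetical page of posts for the given cursor."""
    return FAKE_PAGES[cursor]


posts = collect_all(fake_fetch)
print(len(posts), "posts collected")
```

The `max_pages` cap matters for the methodological issues the article raises: a capped collection is a sample, not the full data set, and should be reported as such.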

In recent years, jihadist terrorist movements have used varying social media platforms to organize, coordinate operations, and spread propaganda. Significant research has focused on understanding these online activities. Such research naturally requires the collection and analysis of social media data produced primarily by jihadist users. The findings of such studies are only as valid as the data they are based on - which is the topic of the present study. In a comprehensive survey of the largest body of research on online jihadist activity - that of jihadist activity on Twitter - we find a wide array of data collection methods. In the majority of cases, we find that these studies fail to acknowledge limitations of the data collection methods, raising serious concerns about the validity of their findings. Similar issues exist in studies that consider other social media platforms. There are known standards of practice and established methodologies for addressing these issues in the field of computer science that it would be useful for scholars of terrorism to become familiar with and apply to future work.

The emergence and ubiquity of online social networks have enriched web data with evolving interactions and communities both at mega-scale and in real-time. This data offers an unprecedented opportunity for studying the interaction between society and disease outbreaks. The challenge we describe in this data paper is how to extract and leverage epidemic outbreak insights from massive amounts of social media data and how this exercise can benefit medical professionals, patients, and policymakers alike.


Weather data




All original content copyright ©
All rights reserved.

This page was created on 2022-04-07 @ 13:00
Last modified on Thursday 7 April, 2022 13:22