Journal articles on the topic "Crowdsourcing, classification, task design, crowdsourcing experiments"


Consult the 34 best journal articles for your research on the topic "Crowdsourcing, classification, task design, crowdsourcing experiments".


1

Yang, Keyu, Yunjun Gao, Lei Liang, Song Bian, Lu Chen, and Baihua Zheng. "CrowdTC: Crowd-powered Learning for Text Classification." ACM Transactions on Knowledge Discovery from Data 16, no. 1 (July 3, 2021): 1–23. http://dx.doi.org/10.1145/3457216.

Abstract:
Text classification is a fundamental task in content analysis. Nowadays, deep learning has demonstrated promising performance in text classification compared with shallow models. However, almost all the existing models do not take advantage of the wisdom of human beings to help text classification. Human beings are more intelligent and capable than machine learning models in terms of understanding and capturing the implicit semantic information from text. In this article, we try to take guidance from human beings to classify text. We propose Crowd-powered learning for Text Classification (CrowdTC for short). We design and post the questions on a crowdsourcing platform to extract keywords in text. Sampling and clustering techniques are utilized to reduce the cost of crowdsourcing. Also, we present an attention-based neural network and a hybrid neural network to incorporate the extracted keywords as human guidance into deep neural networks. Extensive experiments on public datasets confirm that CrowdTC improves the text classification accuracy of neural networks by using the crowd-powered keyword guidance.
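The paper's architecture is not reproduced here; as a rough, hypothetical sketch of the general idea of biasing attention toward crowd-extracted keywords (the function name, the additive-bonus mechanism, and all parameters are assumptions, not the authors' design):

```python
import numpy as np

def keyword_biased_attention(token_embs, query, keyword_mask, bonus=2.0):
    """Attention over tokens with an additive logit bonus for tokens the
    crowd marked as keywords; returns the attended document vector.
    token_embs: (n, d); query: (d,); keyword_mask: (n,) of 0/1."""
    logits = token_embs @ query / np.sqrt(token_embs.shape[1])
    logits += bonus * keyword_mask            # crowd-provided guidance
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()                  # softmax
    return weights @ token_embs

rng = np.random.default_rng(0)
embs = rng.normal(size=(5, 4))                # 5 tokens, 4-dim embeddings
mask = np.array([0, 1, 0, 1, 0])              # tokens 1 and 3 crowd-flagged
print(keyword_biased_attention(embs, rng.normal(size=4), mask))
```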
2

Ramírez, Jorge, Marcos Baez, Fabio Casati, and Boualem Benatallah. "Understanding the Impact of Text Highlighting in Crowdsourcing Tasks." Proceedings of the AAAI Conference on Human Computation and Crowdsourcing 7 (October 28, 2019): 144–52. http://dx.doi.org/10.1609/hcomp.v7i1.5268.

Abstract:
Text classification is one of the most common goals of machine learning (ML) projects, and also one of the most frequent human intelligence tasks in crowdsourcing platforms. ML has mixed success in such tasks depending on the nature of the problem, while crowd-based classification has proven to be surprisingly effective, but can be expensive. Recently, hybrid text classification algorithms, combining human computation and machine learning, have been proposed to improve accuracy and reduce costs. One way to do so is to have ML highlight or emphasize portions of text that it believes to be more relevant to the decision. Humans can then rely only on this text or read the entire text if the highlighted information is insufficient. In this paper, we investigate if and under what conditions highlighting selected parts of the text can (or cannot) improve classification cost and/or accuracy, and in general how it affects the process and outcome of the human intelligence tasks. We study this through a series of crowdsourcing experiments running over different datasets and with task designs imposing different cognitive demands. Our findings suggest that highlighting is effective in reducing classification effort but does not improve accuracy; in fact, low-quality highlighting can decrease it.
3

Guo, Shikai, Rong Chen, Hui Li, Tianlun Zhang, and Yaqing Liu. "Identify Severity Bug Report with Distribution Imbalance by CR-SMOTE and ELM." International Journal of Software Engineering and Knowledge Engineering 29, no. 02 (February 2019): 139–75. http://dx.doi.org/10.1142/s0218194019500074.

Abstract:
Manually inspecting bugs to determine their severity is often an enormous but essential software development task, especially when many participants generate a large number of bug reports in a crowdsourced software testing context. Therefore, boosting the capabilities of methods of predicting bug report severity is critically important for determining the priority of fixing bugs. However, typical classification techniques may be adversely affected when the severity distribution of the bug reports is imbalanced, leading to performance degradation in a crowdsourcing environment. In this study, we propose an enhanced oversampling approach called CR-SMOTE to enhance the classification of bug reports with a realistically imbalanced severity distribution. The main idea is to interpolate new instances into the minority category that are near the center of existing samples in that category. Then, we use an extreme learning machine (ELM) — a feedforward neural network with a single layer of hidden nodes — to predict the bug severity. Several experiments were conducted on three datasets from real bug repositories, and the results statistically indicate that the presented approach is robust against real data imbalance when predicting the severity of bug reports. The average accuracies achieved by the ELM in predicting the severity of Eclipse, Mozilla, and GNOME bug reports were 0.780, 0.871, and 0.861, which are higher than those of classifiers by 4.36%, 6.73%, and 2.71%, respectively.
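The authors' implementation is not shown here; a minimal sketch of the center-directed interpolation idea the abstract describes, assuming numeric feature vectors (the function name and parameter values are illustrative):

```python
import numpy as np

def center_smote(minority, n_new, alpha=0.5, seed=0):
    """Oversample a minority class by interpolating existing samples
    toward the class centroid, roughly in the spirit of CR-SMOTE:
    synthetic points lie near the center of the minority category."""
    rng = np.random.default_rng(seed)
    center = minority.mean(axis=0)
    idx = rng.integers(0, len(minority), size=n_new)
    # Each synthetic point moves a random fraction of the way to the center.
    t = alpha * rng.random((n_new, 1))
    return minority[idx] + t * (center - minority[idx])

minority = np.array([[1.0, 2.0], [2.0, 1.5], [1.5, 3.0]])
print(center_smote(minority, n_new=4))
```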
4

Baba, Yukino, Hisashi Kashima, Kei Kinoshita, Goushi Yamaguchi, and Yosuke Akiyoshi. "Leveraging Crowdsourcing to Detect Improper Tasks in Crowdsourcing Marketplaces." Proceedings of the AAAI Conference on Artificial Intelligence 27, no. 2 (October 6, 2021): 1487–92. http://dx.doi.org/10.1609/aaai.v27i2.18987.

Abstract:
Controlling the quality of tasks is a major challenge in crowdsourcing marketplaces. Most of the existing crowdsourcing services prohibit requesters from posting illegal or objectionable tasks. Operators in the marketplaces have to monitor the tasks continuously to find such improper tasks; however, it is too expensive to manually investigate each task. In this paper, we present the reports of our trial study on automatic detection of improper tasks to support the monitoring of activities by marketplace operators. We perform experiments using real task data from a commercial crowdsourcing marketplace and show that the classifier trained by the operator judgments achieves high accuracy in detecting improper tasks. In addition, to reduce the annotation costs of the operator and improve the classification accuracy, we consider the use of crowdsourcing for task annotation. We hire a group of crowdsourcing (non-expert) workers to monitor posted tasks, and incorporate their judgments into the training data of the classifier. By applying quality control techniques to handle the variability in worker reliability, our results show that the use of non-expert judgments by crowdsourcing workers in combination with expert judgments improves the accuracy of detecting improper crowdsourcing tasks.
5

Ceschia, Sara, Kevin Roitero, Gianluca Demartini, Stefano Mizzaro, Luca Di Gaspero, and Andrea Schaerf. "Task design in complex crowdsourcing experiments: Item assignment optimization." Computers & Operations Research 148 (December 2022): 105995. http://dx.doi.org/10.1016/j.cor.2022.105995.

6

Sun, Yuyin, Adish Singla, Tori Yan, Andreas Krause, and Dieter Fox. "Evaluating Task-Dependent Taxonomies for Navigation." Proceedings of the AAAI Conference on Human Computation and Crowdsourcing 4 (September 21, 2016): 229–38. http://dx.doi.org/10.1609/hcomp.v4i1.13286.

Abstract:
Taxonomies of concepts are important across many application domains, for instance, online shopping portals use catalogs to help users navigate and search for products. Task-dependent taxonomies, e.g., adapting the taxonomy to a specific cohort of users, can greatly improve the effectiveness of navigation and search. However, taxonomies are usually created by domain experts and hence designing task-dependent taxonomies can be an expensive process: this often limits the applications to deploy generic taxonomies. Crowdsourcing-based techniques have the potential to provide a cost-efficient solution to building task-dependent taxonomies. In this paper, we present the first quantitative study to evaluate the effectiveness of these crowdsourcing based techniques. Our experimental study compares different task-dependent taxonomies built via crowdsourcing and generic taxonomies built by experts. We design randomized behavioral experiments on the Amazon Mechanical Turk platform for navigation tasks using these taxonomies resembling real-world applications such as product search. We record various metrics such as the time of navigation, the number of clicks performed, and the search path taken by a participant to navigate the taxonomy to locate a desired object. Our findings show that task-dependent taxonomies built by crowdsourcing techniques can reduce the navigation time up to 20%. Our results, in turn, demonstrate the power of crowdsourcing for learning complex structures such as semantic taxonomies.
7

Lin, Christopher, Mausam Mausam, and Daniel Weld. "Dynamically Switching between Synergistic Workflows for Crowdsourcing." Proceedings of the AAAI Conference on Artificial Intelligence 26, no. 1 (September 20, 2021): 87–93. http://dx.doi.org/10.1609/aaai.v26i1.8121.

Abstract:
To ensure quality results from unreliable crowdsourced workers, task designers often construct complex workflows and aggregate worker responses from redundant runs. Frequently, they experiment with several alternative workflows to accomplish the task, and eventually deploy the one that achieves the best performance during early trials. Surprisingly, this seemingly natural design paradigm does not achieve the full potential of crowdsourcing. In particular, using a single workflow (even the best) to accomplish a task is suboptimal. We show that alternative workflows can compose synergistically to yield much higher quality output. We formalize the insight with a novel probabilistic graphical model. Based on this model, we design and implement AGENTHUNT, a POMDP-based controller that dynamically switches between these workflows to achieve higher returns on investment. Additionally, we design offline and online methods for learning model parameters. Live experiments on Amazon Mechanical Turk demonstrate the superiority of AGENTHUNT for the task of generating NLP training data, yielding up to 50% error reduction and greater net utility compared to previous methods.
8

Rothwell, Spencer, Steele Carter, Ahmad Elshenawy, and Daniela Braga. "Job Complexity and User Attention in Crowdsourcing Microtasks." Proceedings of the AAAI Conference on Human Computation and Crowdsourcing 3 (March 28, 2016): 20–25. http://dx.doi.org/10.1609/hcomp.v3i1.13265.

Abstract:
This paper examines the importance of presenting simple, intuitive tasks when conducting microtasking on crowdsourcing platforms. Most crowdsourcing platforms allow the maker of a task to present any length of instructions to the crowd workers who participate in their tasks. Our experiments show, however, that most workers who participate in crowdsourcing microtasks do not read the instructions, even when they are very brief. To facilitate success in microtask design, we highlight the importance of making simple, easy-to-grasp tasks that do not rely on instructions for explanation.
9

Qarout, Rehab, Alessandro Checco, Gianluca Demartini, and Kalina Bontcheva. "Platform-Related Factors in Repeatability and Reproducibility of Crowdsourcing Tasks." Proceedings of the AAAI Conference on Human Computation and Crowdsourcing 7 (October 28, 2019): 135–43. http://dx.doi.org/10.1609/hcomp.v7i1.5264.

Abstract:
Crowdsourcing platforms provide a convenient and scalable way to collect human-generated labels on-demand. This data can be used to train Artificial Intelligence (AI) systems or to evaluate the effectiveness of algorithms. The datasets generated by means of crowdsourcing are, however, dependent on many factors that affect their quality. These include, among others, the population sample bias introduced by aspects like task reward, requester reputation, and other filters introduced by the task design. In this paper, we analyse platform-related factors and study how they affect dataset characteristics by running a longitudinal study where we compare the reliability of results collected with repeated experiments over time and across crowdsourcing platforms. Results show that, under certain conditions: 1) experiments replicated across different platforms result in significantly different data quality levels, while 2) the quality of data from repeated experiments over time is stable within the same platform. We identify some key task design variables that cause such variations and propose an experimentally validated set of actions to counteract these effects, thus achieving reliable and repeatable crowdsourced data collection experiments.
10

Fu, Donglai, and Yanhua Liu. "Fairness of Task Allocation in Crowdsourcing Workflows." Mathematical Problems in Engineering 2021 (April 23, 2021): 1–11. http://dx.doi.org/10.1155/2021/5570192.

Abstract:
Fairness plays a vital role in crowd computing by attracting its workers. The power of crowd computing stems from a large number of workers potentially available to provide high quality of service and reduce costs. An important challenge in the crowdsourcing market today is the task allocation of crowdsourcing workflows. Requester-centric task allocation algorithms aim to maximize the completion quality of the entire workflow and minimize its total cost, which is discriminatory toward workers. The crowdsourcing workflow needs to balance two objectives, namely, fairness and cost. In this study, we propose an alternative greedy approach with four heuristic strategies to address this issue. In particular, the proposed approach aims to monitor the current status of workflow execution and use heuristic strategies to adjust the parameters of task allocation. We design a two-phase allocation model to accurately match tasks with workers. F-Aware allocates each task to the worker that maximizes fairness and minimizes cost. We conduct extensive experiments to quantitatively evaluate the proposed algorithms in terms of running time, fairness, and cost, using a custom objective function on WorkflowSim, a well-known cloud simulation tool. Experimental results based on real-world workflows show that F-Aware outperforms the best competing algorithm, beating it by about 1% in finding the tradeoff between fairness and cost.
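F-Aware itself is specified in the paper; the toy sketch below only illustrates the general pattern of greedily scoring workers on a weighted combination of cost and a fairness proxy (the scoring function, weights, and bid data are assumptions):

```python
import numpy as np

def greedy_assign(costs, assigned_counts, w_fair=0.5):
    """Choose a worker for one task by a score trading off normalized
    cost against a fairness proxy (how many tasks the worker already
    holds); the lowest combined score wins."""
    cost_term = (costs - costs.min()) / (np.ptp(costs) + 1e-9)
    fair_term = assigned_counts / (assigned_counts.max() + 1e-9)
    score = (1 - w_fair) * cost_term + w_fair * fair_term
    return int(np.argmin(score))

counts = np.zeros(3)
bids = np.array([[3.0, 5.0, 4.0], [2.0, 6.0, 2.0], [4.0, 4.0, 3.0]])
for task_bids in bids:                 # one row of worker bids per task
    counts[greedy_assign(task_bids, counts)] += 1
print(counts)  # work spreads out instead of all going to the cheapest bidder
```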
11

Cui, Lizhen, Xudong Zhao, Lei Liu, Han Yu, and Yuan Miao. "Complex crowdsourcing task allocation strategies employing supervised and reinforcement learning." International Journal of Crowd Science 1, no. 2 (June 12, 2017): 146–60. http://dx.doi.org/10.1108/ijcs-08-2017-0011.

Abstract:
Purpose: Allocation of complex crowdsourcing tasks, which typically include heterogeneous attributes such as value, difficulty, skill required, effort required and deadline, is still a challenging open problem. In recent years, agent-based crowdsourcing approaches focusing on recommendations or incentives have emerged to dynamically match workers with diverse characteristics to tasks to achieve high collective productivity. However, existing approaches are mostly designed based on expert knowledge grounded in well-established theoretical frameworks. They often fail to leverage user-generated data to capture the complex interaction of crowdsourcing participants' behaviours. This paper aims to address this challenge.
Design/methodology/approach: The paper proposes a policy network plus reputation network (PNRN) approach which combines supervised learning and reinforcement learning to imitate the human task allocation strategies that beat artificial intelligence strategies in this large-scale empirical study. The proposed approach incorporates a policy network for the selection of task allocation strategies and a reputation network for calculating the trends of worker reputation fluctuations. Then, by iteratively applying the policy network and reputation network, a multi-round allocation strategy is proposed.
Findings: PNRN has been trained and evaluated using a large-scale real human task allocation strategy dataset derived from the Agile Manager game, with close to 500,000 decision records from 1,144 players in over 9,000 game sessions. Extensive experiments demonstrate the validity and efficiency of the complex crowdsourcing task allocation strategy learned from human participants.
Originality/value: The paper can give a better task allocation strategy in crowdsourcing systems.
12

Kim, Yongsung, Emily Harburg, Shana Azria, Aaron Shaw, Elizabeth Gerber, Darren Gergle, and Haoqi Zhang. "Studying the Effects of Task Notification Policies on Participation and Outcomes in On-the-go Crowdsourcing." Proceedings of the AAAI Conference on Human Computation and Crowdsourcing 4 (September 21, 2016): 99–108. http://dx.doi.org/10.1609/hcomp.v4i1.13275.

Abstract:
Recent years have seen the growth of physical crowdsourcing systems (e.g., Uber; TaskRabbit) that motivate large numbers of people to provide new and improved physical tasking and delivery services on-demand. In these systems, opportunistically relying on people to make convenient contributions may lead to incomplete solutions, while directing people to do inconvenient tasks requires high incentives. To increase people's willingness to participate and reduce the need to incentivize participation, we study on-the-go crowdsourcing as an alternative approach that suggests tasks along people’s existing routes that are conveniently on their way. We explore as a first step in this paper the design of task notification policies that decide when, where, and to whom to suggest tasks. Situating our work in the context of practical problems such as package delivery and lost-and-found searches, we conducted controlled experiments that show how small changes in task notification policy can influence individual participation and actions in significant ways that in turn affect system outcomes. We discuss the implications of our findings on the design of future on-the-go crowdsourcing technologies and applications.
13

Zeng, Zhiyuan, Jian Tang, and Tianmei Wang. "Motivation mechanism of gamification in crowdsourcing projects." International Journal of Crowd Science 1, no. 1 (March 6, 2017): 71–82. http://dx.doi.org/10.1108/ijcs-12-2016-0001.

Abstract:
Purpose: The purpose of this paper is to study participation behaviors in the context of crowdsourcing projects from the perspective of gamification.
Design/methodology/approach: This paper first proposes a model to depict the effect of four categories of game elements on three types of motivation, based upon several motivation theories, which may in turn influence user participation. Then, 5 × 2 between-subject Web experiments were designed for collecting data and validating this model.
Findings: Game elements which provide participants with rewards and recognition, or remind participants of the completion progress of their tasks, may positively influence extrinsic motivation, whereas game elements which help create a fantasy scene may strengthen intrinsic motivation. Besides, recognition-type and progress-type game elements may trigger the internalization of extrinsic motivation. In addition, when a task is of high complexity, the effects of game elements on extrinsic and intrinsic motivation will be less prominent, whereas the internalization of extrinsic motivation may benefit from the increase of task complexity.
Originality/value: This study may uncover the motivation mechanism of several different kinds of game elements, which may help to find which game elements are more effective in enhancing engagement and participation in crowdsourcing projects. Besides, as task complexity is used as a moderator, one may be able to identify whether task complexity influences the effects of game elements on motivations. Last but not least, this study indicates the interrelationship between game elements, individual motivation and user participation, which can be adapted by other scholars.
14

Bu, Qiong, Elena Simperl, Adriane Chapman, and Eddy Maddalena. "Quality assessment in crowdsourced classification tasks." International Journal of Crowd Science 3, no. 3 (September 2, 2019): 222–48. http://dx.doi.org/10.1108/ijcs-06-2019-0017.

Abstract:
Purpose: Ensuring quality is one of the most significant challenges in microtask crowdsourcing. Aggregation of the data collected from the crowd is an important step in inferring the correct answer, but existing studies seem limited to single-step tasks. This study looks at multiple-step classification tasks to understand aggregation in such cases; it is hence useful for assessing classification quality.
Design/methodology/approach: The authors present a model to capture the information of the workflow, questions and answers for both single- and multiple-question classification tasks. They propose an approach adapted from the classic one so that the model can handle tasks with several multiple-choice questions in general, rather than a specific domain or specific hierarchical classifications. They evaluate the approach with three representative tasks from existing citizen science projects for which a gold standard created by experts is available.
Findings: The results show that the approach can significantly improve overall classification accuracy. The analysis also demonstrates that all algorithms achieve higher accuracy on volunteer-generated than on paid-generated datasets for the same task. Furthermore, the authors observed interesting patterns in the relationship between the performance of different algorithms and workflow-specific factors, including the number of steps and the number of available options in each step.
Originality/value: Due to the nature of crowdsourcing, aggregating the collected data is an important process for understanding the quality of crowdsourcing results. Different inference algorithms have been studied for simple microtasks consisting of single questions with two or more answers. However, as classification tasks typically contain many questions, the proposed method can be applied to a wide range of tasks, including both single- and multiple-question classification tasks.
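For context, here is the classic per-step baseline that adapted aggregation approaches like this one build on: independent majority voting at each question of the workflow (a simplification; the paper's model goes further):

```python
from collections import Counter

def aggregate_workflow(responses):
    """Majority-vote aggregation applied independently at each step of a
    multi-question classification workflow. responses: list of worker
    answer sequences, one answer per step."""
    n_steps = len(responses[0])
    consensus = []
    for step in range(n_steps):
        votes = Counter(r[step] for r in responses)
        consensus.append(votes.most_common(1)[0][0])
    return consensus

workers = [("animal", "bird"), ("animal", "bird"), ("animal", "mammal")]
print(aggregate_workflow(workers))  # ['animal', 'bird']
```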
15

Shin, Suho, Hoyong Choi, Yung Yi, and Jungseul Ok. "Power of Bonus in Pricing for Crowdsourcing." ACM SIGMETRICS Performance Evaluation Review 50, no. 1 (June 20, 2022): 43–44. http://dx.doi.org/10.1145/3547353.3522633.

Abstract:
We consider a simple form of pricing for a crowdsourcing system, where the pricing policy is published a priori, and workers then decide their task acceptance. Such a pricing form is widely adopted in practice for its simplicity, e.g., on Amazon Mechanical Turk, although additional sophistication in the pricing rule can enhance budget efficiency. With the goal of designing efficient and simple pricing rules, we study the impact of the following two design features in pricing policies: (i) personalization, tailoring the policy worker by worker, and (ii) bonus payment for qualified task completion. In the Bayesian setting, where only the prior distribution of workers' profiles is available, we first study the Price of Agnosticism (PoA), which quantifies the utility gap between personalized and common pricing policies. We show that PoA is bounded within a constant factor under some mild conditions, and that the impact of bonus is essential in common pricing. These analytic results imply that complex personalized pricing can be replaced by simple common pricing once it is equipped with a proper bonus payment. To provide insights on efficient common pricing, we then study efficient mechanisms of bonus payment for several profile distribution regimes which may exist in practice. We provide primitive experiments on Amazon Mechanical Turk, which support our analytical findings [5].
16

Shin, Suho, Hoyong Choi, Yung Yi, and Jungseul Ok. "Power of Bonus in Pricing for Crowdsourcing." Proceedings of the ACM on Measurement and Analysis of Computing Systems 5, no. 3 (December 14, 2021): 1–25. http://dx.doi.org/10.1145/3491048.

Abstract:
We consider a simple form of pricing for a crowdsourcing system, where the pricing policy is published a priori, and workers then decide their task acceptance. Such a pricing form is widely adopted in practice for its simplicity, e.g., on Amazon Mechanical Turk, although additional sophistication in the pricing rule can enhance budget efficiency. With the goal of designing efficient and simple pricing rules, we study the impact of the following two design features in pricing policies: (i) personalization, tailoring the policy worker by worker, and (ii) bonus payment for qualified task completion. In the Bayesian setting, where only the prior distribution of workers' profiles is available, we first study the Price of Agnosticism (PoA), which quantifies the utility gap between personalized and common pricing policies. We show that PoA is bounded within a constant factor under some mild conditions, and that the impact of bonus is essential in common pricing. These analytic results imply that complex personalized pricing can be replaced by simple common pricing once it is equipped with a proper bonus payment. To provide insights on efficient common pricing, we then study efficient mechanisms of bonus payment for several profile distribution regimes which may exist in practice. We provide primitive experiments on Amazon Mechanical Turk, which support our analytical findings.
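A toy expected-utility calculation in the spirit of the base-price versus price-plus-bonus comparison above; the acceptance and quality probabilities and the functional form are assumptions for illustration, not the paper's model:

```python
def requester_utility(price, bonus, p_accept, p_qualify, value=1.0):
    """Toy expected per-task utility for a posted-price policy with an
    optional bonus paid only on qualified completion: a worker accepts
    with probability p_accept, then qualifies with probability p_qualify."""
    expected_pay = price + bonus * p_qualify
    return p_accept * (value * p_qualify - expected_pay)

# Flat price vs. a lower base price plus bonus, where the bonus is
# assumed to raise output quality slightly.
print(requester_utility(price=0.30, bonus=0.00, p_accept=0.6, p_qualify=0.7))
print(requester_utility(price=0.15, bonus=0.15, p_accept=0.6, p_qualify=0.8))
```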
17

Yang, Yi, Yurong Cheng, Ye Yuan, Guoren Wang, Lei Chen, and Yongjiao Sun. "Privacy-preserving cooperative online matching over spatial crowdsourcing platforms." Proceedings of the VLDB Endowment 16, no. 1 (September 2022): 51–63. http://dx.doi.org/10.14778/3561261.3561266.

Abstract:
With the continuous development of spatial crowdsourcing platforms, the online task assignment problem has been widely studied as a typical problem in spatial crowdsourcing. Most existing studies are based on single-platform task assignment to maximize the platform's revenue. Recently, cross online task assignment has been proposed, aiming at increasing mutual benefit through cooperation. However, existing methods fail to consider data privacy protection in the process of cooperation and cause the leakage of sensitive data such as the location of a request and the historical data of cooperating platforms. In this paper, we propose Privacy-preserving Cooperative Online Matching (PCOM), which protects the privacy of the users and workers on their respective platforms. We design a PCOM framework and provide theoretical proof that the framework satisfies the differential privacy property. We then propose two PCOM algorithms based on two different privacy-preserving strategies. Extensive experiments on real and synthetic datasets confirm the effectiveness and efficiency of our algorithms.
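PCOM's actual mechanisms are in the paper; the snippet below only illustrates the standard Laplace mechanism from which differential-privacy guarantees of this kind are typically built, applied to a bounded location coordinate (domain bounds and epsilon are illustrative):

```python
import numpy as np

def laplace_perturb(value, lo, hi, epsilon, rng):
    """Release a bounded scalar (e.g., one coordinate of a request's
    location) under epsilon-differential privacy via the Laplace
    mechanism: noise scale = sensitivity / epsilon, where the
    sensitivity of an identity query on [lo, hi] is hi - lo."""
    noisy = value + rng.laplace(scale=(hi - lo) / epsilon)
    return min(max(noisy, lo), hi)  # clamping is DP-safe post-processing

rng = np.random.default_rng(1)
print(laplace_perturb(0.42, lo=0.0, hi=1.0, epsilon=0.5, rng=rng))
```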
18

Jacques, Jason, and Per Ola Kristensson. "Crowdsourcing a HIT: Measuring Workers' Pre-Task Interactions on Microtask Markets." Proceedings of the AAAI Conference on Human Computation and Crowdsourcing 1 (November 3, 2013): 86–93. http://dx.doi.org/10.1609/hcomp.v1i1.13085.

Abstract:
The ability to entice and engage crowd workers to participate in human intelligence tasks (HITs) is critical for many human computation systems and large-scale experiments. While various metrics have been devised to measure and improve the quality of worker output via task designs, effective recruitment of crowd workers is often overlooked. To help us gain a better understanding of crowd recruitment strategies we propose three new metrics for measuring crowd workers' willingness to participate in advertised HITs: conversion rate, conversion rate over time, and nominal conversion rate. We discuss how the conversion rate of workers—the number of potential workers aware of a task that choose to accept the task—can affect the quantity, quality, and validity of any data collected via crowdsourcing. We also contribute a tool — turkmill — that enables requesters on Amazon Mechanical Turk to easily measure the conversion rate of HITs. We then present the results of two experiments that demonstrate how conversion rate metrics can be used to evaluate the effect of different HIT designs. We investigate how four HIT design features (value proposition, branding, quality of presentation, and intrinsic motivation) affect conversion rates. Among other things, we find that including a clear value proposition has a strong, significant, positive effect on the nominal conversion rate. We also find that crowd workers prefer commercial entities to non-profit or university requesters.
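A minimal reading of the proposed metrics (the paper's exact definitions may differ; the counts below are hypothetical):

```python
def conversion_rate(accepted, previewed):
    """Fraction of workers aware of a HIT (approximated here by preview
    events) who go on to accept it."""
    return accepted / previewed if previewed else 0.0

def nominal_conversion_rate(submitted, previewed):
    """Stricter variant counting only workers who actually submit work."""
    return submitted / previewed if previewed else 0.0

# Hypothetical preview/accept/submit counts for one HIT design.
print(conversion_rate(accepted=38, previewed=120))           # ~0.32
print(nominal_conversion_rate(submitted=31, previewed=120))  # ~0.26
```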
19

Suissa, Omri, Avshalom Elmalech, and Maayan Zhitomirsky-Geffet. "Toward the optimized crowdsourcing strategy for OCR post-correction." Aslib Journal of Information Management 72, no. 2 (December 9, 2019): 179–97. http://dx.doi.org/10.1108/ajim-07-2019-0189.

Abstract:
Purpose: Digitization of historical documents is a challenging task in many digital humanities projects. A popular approach to digitization is to scan documents into images and then convert the images into text using optical character recognition (OCR) algorithms. However, the outcome of OCR processing of historical documents is usually inaccurate and requires post-processing error correction. The purpose of this paper is to investigate how crowdsourcing can be utilized to correct OCR errors in historical text collections, and which crowdsourcing methodology is the most effective in different scenarios and for various research objectives.
Design/methodology/approach: A series of experiments with different micro-task structures and text lengths was conducted with 753 workers on Amazon's Mechanical Turk platform. The workers had to fix OCR errors in a selected historical text. To analyze the results, new accuracy and efficiency measures were devised.
Findings: The analysis suggests that, in terms of accuracy, the optimal text length is medium (paragraph-size) and the optimal experiment structure is two-phase with a scanned image. In terms of efficiency, the best results were obtained when using longer text in a single-stage structure with no image.
Practical implications: The study provides practical recommendations to researchers on how to build the optimal crowdsourcing task for OCR post-correction. The developed methodology can also be utilized to create gold standard historical texts for automatic OCR post-correction.
Originality/value: This is the first attempt to systematically investigate the influence of various factors on crowdsourcing-based OCR post-correction and to propose an optimal strategy for this process.
20

Gao, Li-Ping, Tao Jin, and Chao Lu. "A Long-Term Quality Perception Incentive Strategy for Crowdsourcing Environments with Budget Constraints." International Journal of Cooperative Information Systems 29, no. 01n02 (March 2020): 2040005. http://dx.doi.org/10.1142/s0218843020400055.

Abstract:
Quality control is a critical design goal for crowdsourcing. However, when measuring the long-term quality of workers, some existing strategies do not make effective use of workers' historical information, while others treat workers' conditions as fixed values and do not consider the impact of workers' quality. This paper proposes a long-term quality perception incentive model (called the QAI model) for a crowdsourcing environment with budget constraints. In this work, QAI divides the entire long-term activity cycle into multiple stages based on proportional allocation rules. Each stage treats the interaction between the requester and the worker as a reverse auction process. At each stage, a truthful, individually rational, budget-feasible, quality-aware task allocation algorithm is designed. At the end of each stage, based on a hidden Markov model (HMM), this paper proposes a new framework for quality prediction and parameter learning, which can make efficient use of workers' historical information. Experiments have verified the feasibility of our algorithm and showed that the proposed QAI model leads to improved results.
21

Musi, Elena, Debanjan Ghosh, and Smaranda Muresan. "ChangeMyView Through Concessions: Do Concessions Increase Persuasion?" Dialogue & Discourse 9, no. 1 (August 10, 2018): 107–27. http://dx.doi.org/10.5087/dad.2018.104.

Abstract:
In Discourse Studies concessions are considered among those argumentative strategies that increase persuasion. We aim to empirically test this hypothesis by calculating the distribution of argumentative concessions in persuasive vs. non-persuasive comments from the ChangeMyView subreddit. This constitutes a challenging task since concessions do not always bear an argumentative role and are expressed through polysemous lexical markers. Drawing from a theoretically-informed typology of concessions, we first conduct a crowdsourcing task to label a set of polysemous lexical markers as introducing an argumentative concession relation or not. Second, we present a self-training method to automatically identify argumentative concessions using linguistically motivated features. While we achieve a moderate F1 of 57.4% via the self-training method, our subsequent error analysis highlights that the self-training method is able to generalize and identify other types of concessions that are argumentative, but were not considered in the annotation guidelines. Our findings from the manual labeling and the classification experiments indicate that the type of argumentative concessions we investigated is almost equally likely to be used in winning and losing arguments. While this result seems to contradict theoretical assumptions, we provide some reasons related to the ChangeMyView subreddit.
22

Sayin, Burcu, Evgeny Krivosheev, Jie Yang, Andrea Passerini, and Fabio Casati. "A review and experimental analysis of active learning over crowdsourced data." Artificial Intelligence Review 54, no. 7 (May 30, 2021): 5283–305. http://dx.doi.org/10.1007/s10462-021-10021-3.

Abstract:
Training data creation is increasingly a key bottleneck for developing machine learning, especially for deep learning systems. Active learning provides a cost-effective means for creating training data by selecting the most informative instances for labeling. Labels in real applications are often collected from crowdsourcing, which engages online crowds for data labeling at scale. Despite the importance of using crowdsourced data in the active learning process, an analysis of how the existing active learning approaches behave over crowdsourced data is currently missing. This paper aims to fill this gap by reviewing the existing active learning approaches and then testing a set of benchmarking ones on crowdsourced datasets. We provide a comprehensive and systematic survey of the recent research on active learning in the hybrid human–machine classification setting, where crowd workers contribute labels (often noisy) to either directly classify data instances or to train machine learning models. We identify three categories of state-of-the-art active learning methods according to whether and how predefined queries are employed for data sampling, namely fixed-strategy approaches, dynamic-strategy approaches, and strategy-free approaches. We then conduct an empirical study on their cost-effectiveness, showing that the performance of the existing active learning approaches is affected by many factors in hybrid classification contexts, such as the noise level of data, the label fusion technique used, and the specific characteristics of the task. Finally, we discuss challenges and identify potential directions to design active learning strategies for hybrid classification problems.
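As a concrete instance of the fixed-strategy category surveyed above, here is a least-confidence (uncertainty sampling) query step on synthetic data; in a crowdsourced setting the queried labels would additionally be noisy and need fusion (the data and model choice are illustrative):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def uncertainty_sample(model, pool_X, k):
    """Pick the k pool instances whose predicted positive-class
    probability is closest to 0.5 (most uncertain for binary labels)."""
    proba = model.predict_proba(pool_X)[:, 1]
    return np.argsort(np.abs(proba - 0.5))[:k]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.3 * rng.normal(size=200) > 0).astype(int)
labeled, pool = np.arange(20), np.arange(20, 200)

model = LogisticRegression().fit(X[labeled], y[labeled])
query = pool[uncertainty_sample(model, X[pool], k=5)]
print(query)  # indices the crowd would be asked to label next
```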
23

Shiraishi, Yuhki, Jianwei Zhang, Daisuke Wakatsuki, Katsumi Kumai, and Atsuyuki Morishima. "Crowdsourced real-time captioning of sign language by deaf and hard-of-hearing people." International Journal of Pervasive Computing and Communications 13, no. 1 (April 3, 2017): 2–25. http://dx.doi.org/10.1108/ijpcc-02-2017-0014.

Abstract:
Purpose: The purpose of this paper is to explore how to achieve crowdsourced real-time captioning of sign language by deaf and hard-of-hearing (DHH) people: how the system structure should be designed, how a continuous sign language captioning task should be divided into microtasks, and how many DHH people are required to maintain high-quality real-time captioning.
Design/methodology/approach: The authors first propose a system structure, including a new design of worker roles, task division and task assignment. Then, based on an implemented prototype, they analyze the settings necessary for crowdsourced real-time captioning of sign language, test the feasibility of the proposed system, and explore its robustness and improvability through four experiments.
Findings: The results of Experiment 1 reveal the optimal method for task division, the necessary minimum number of groups and the necessary minimum number of workers in a group. The results of Experiment 2 verify the feasibility of crowdsourced real-time captioning of sign language by DHH people. The results of Experiments 3 and 4 show the robustness and improvability of the captioning system.
Originality/value: Although some crowdsourcing-based systems have been developed for captioning voice to text, the authors address the captioning of sign language to text, for which existing approaches do not work well due to the unique properties of sign language. Moreover, DHH people are generally considered as those who receive support from others, but this proposal helps them become those who offer support to others.
24

Trippas, Johanne R. "Spoken conversational search." ACM SIGIR Forum 53, no. 2 (December 2019): 106–7. http://dx.doi.org/10.1145/3458553.3458570.

Abstract:
Speech-based web search where no keyboard or screens are available to present search engine results is becoming ubiquitous, mainly through the use of mobile devices and intelligent assistants such as Apple's HomePod, Google Home, or Amazon Alexa. Currently, these intelligent assistants do not maintain a lengthy information exchange. They do not track context or present information suitable for an audio-only channel, and do not interact with the user in a multi-turn conversation. Understanding how users would interact with such an audio-only interaction system in multi-turn information seeking dialogues, and what users expect from these new systems, are unexplored in search settings. In particular, the knowledge on how to present search results over an audio-only channel and which interactions take place in this new search paradigm is crucial to incorporate while producing usable systems [9, 2, 8]. Thus, constructing insight into the conversational structure of information seeking processes provides researchers and developers opportunities to build better systems while creating a research agenda and directions for future advancements in Spoken Conversational Search (SCS). Such insight has been identified as crucial in the growing SCS area. At the moment, limited understanding has been acquired for SCS, for example, how the components interact, how information should be presented, or how task complexity impacts the interactivity or discourse behaviours. We aim to address these knowledge gaps. This thesis outlines the breadth of SCS and forms a manifesto advancing this highly interactive search paradigm with new research directions including prescriptive notions for implementing identified challenges [3]. We investigate SCS through quantitative and qualitative designs: (i) log and crowdsourcing experiments investigating different interaction and results presentation styles [1, 6], and (ii) the creation and analysis of the first SCS dataset and annotation schema through designing and conducting an observational study of information seeking dialogues [11, 5, 7]. We propose new research directions and design recommendations based on the triangulation of three different datasets and methods: the log analysis to identify practical challenges and limitations of existing systems while informing our future observational study; the crowdsourcing experiment to validate a new experimental setup for future search engine results presentation investigations; and the observational study to establish the SCS dataset (SCSdata), form the first Spoken Conversational Search Annotation Schema (SCoSAS), and study interaction behaviours for different task complexities. Our principal contributions are based on our observational study for which we developed a novel methodology utilising a qualitative design [10]. We show that existing information seeking models may be insufficient for the new SCS search paradigm because they inadequately capture meta-discourse functions and the system's role as an active agent. Thus, the results indicate that SCS systems have to support the user through discourse functions and be actively involved in the users' search process. This suggests that interactivity between the user and system is necessary to overcome the increased complexity which has been imposed upon the user and system by the constraints of the audio-only communication channel [4]. We then present the first schematic model for SCS which is derived from the SCoSAS through the qualitative analysis of the SCSdata.
In addition, we demonstrate the applicability of our dataset by investigating the effect of task complexity on interaction and discourse behaviour. Lastly, we present SCS design recommendations and outline new research directions for SCS. The implications of our work are practical, conceptual, and methodological. The practical implications include the development of the SCSdata, the SCoSAS, and SCS design recommendations. The conceptual implications include the development of a schematic SCS model which identifies the need for increased interactivity and pro-activity to overcome the audio-imposed complexity in SCS. The methodological implications include the development of the crowdsourcing framework, and techniques for developing and analysing SCS datasets. In summary, we believe that our findings can guide researchers and developers to help improve existing interactive systems which are less constrained, such as mobile search, as well as more constrained systems such as SCS systems.
25

Hasegawa-Johnson, Mark, Jennifer Cole, Preethi Jyothi, and Lav R. Varshney. "Models of dataset size, question design, and cross-language speech perception for speech crowdsourcing applications." Laboratory Phonology 6, no. 3-4 (January 1, 2015). http://dx.doi.org/10.1515/lp-2015-0012.

Abstract:
AbstractTranscribers make mistakes. Workers recruited in a crowdsourcing marketplace, because of their varying levels of commitment and education, make more mistakes than workers in a controlled laboratory setting. Methods for compensating transcriber mistakes are desirable because, with such methods available, crowdsourcing has the potential to significantly increase the scale of experiments in laboratory phonology. This paper provides a brief tutorial on statistical learning theory, introducing the relationship between dataset size and estimation error, then presents a theoretical description and preliminary results for two new methods that control labeler error in laboratory phonology experiments. First, we discuss the method of crowdsourcing over error-correcting codes. In the error-correcting-code method, each difficult labeling task is first factored, by the experimenter, into the product of several easy labeling tasks (typically binary). Factoring increases the total number of tasks, nevertheless it results in faster completion and higher accuracy, because workers unable to perform the difficult task may be able to meaningfully contribute to the solution of each easy task. Second, we discuss the use of explicit mathematical models of the errors made by a worker in the crowd. In particular, we introduce the method of mismatched crowdsourcing, in which workers transcribe a language they do not understand, and an explicit mathematical model of second-language phoneme perception is used to learn and then compensate their transcription errors. Though introduced as technologies that increase the scale of phonology experiments, both methods have implications beyond increased scale. The method of easy questions permits us to probe the perception, by untrained listeners, of complicated phonological models; examples are provided from the prosody of English and Hindi. The method of mismatched crowdsourcing permits us to probe, in more detail than ever before, the perception of phonetic categories by listeners with a different phonological system.
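A toy version of the factoring idea as the abstract describes it: a hard 4-way label is split into two easy binary questions, each answered redundantly by workers and decoded by per-question majority vote (the label names are invented for illustration):

```python
from collections import Counter

# Hypothetical 2-bit factoring of a 4-way label: each "easy" binary
# question is answered by several workers and decoded by majority.
CODE = {(0, 0): "L-L%", (0, 1): "L-H%", (1, 0): "H-L%", (1, 1): "H-H%"}

def majority(bits):
    return Counter(bits).most_common(1)[0][0]

def decode(answers_per_bit):
    """answers_per_bit: list of per-question worker answer lists."""
    return CODE[tuple(majority(a) for a in answers_per_bit)]

# Three noisy workers per binary question still recover the label.
print(decode([[0, 0, 1], [1, 1, 1]]))  # 'L-H%'
```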
26

Ramírez, Jorge, Marcos Baez, Fabio Casati, and Boualem Benatallah. "Crowdsourced dataset to study the generation and impact of text highlighting in classification tasks." BMC Research Notes 12, no. 1 (December 2019). http://dx.doi.org/10.1186/s13104-019-4858-z.

Abstract:
Objectives: Text classification is a recurrent goal in machine learning projects and a typical task in crowdsourcing platforms. Hybrid approaches, leveraging crowdsourcing and machine learning, work better than either in isolation and help to reduce crowdsourcing costs. One way to mix crowd and machine efforts is to have algorithms highlight passages from texts and feed these to the crowd for classification. In this paper, we present a dataset to study text highlighting generation and its impact on document classification.
Data description: The dataset was created through two series of experiments where we first asked workers to (i) classify documents according to a relevance question and highlight the parts of the text that supported their decision, and in a second phase, (ii) assess document relevance supported by text highlighting of varying quality (six human-generated and six machine-generated highlighting conditions). The dataset features documents from two application domains (systematic literature reviews and product reviews), three document sizes, and three relevance questions of different levels of difficulty. We expect this dataset of 27,711 individual judgments from 1,851 workers to benefit not only this specific problem domain, but the larger class of classification problems where crowdsourced datasets with individual judgments are scarce.
27

Li, Yu, Haonan Feng, Zhankui Peng, Li Zhou, and Jian Wan. "Diversity-aware unmanned vehicle team arrangement in mobile crowdsourcing." EURASIP Journal on Wireless Communications and Networking 2022, no. 1 (June 23, 2022). http://dx.doi.org/10.1186/s13638-022-02139-x.

Abstract:
With the continuous development of mobile edge computing and the improvement of unmanned vehicle technology, unmanned vehicles can handle ever-increasing demands. As a significant application of unmanned vehicles, spatial crowdsourcing provides an important application scenario: organizing many unmanned vehicles to conduct spatial tasks by physically moving to the tasks' locations, called task assignment. Previous works usually focus on assigning a spatial task to one single vehicle or a group of vehicles. Few of them consider that vehicle team diversity is essential to collaborative work, which benefits from organizing teams of vehicles with various backgrounds. In this paper, we consider a spatial crowdsourcing scenario in which each vehicle has a set of skills and a property, where the property denotes the vehicle's special attribute (e.g., size, speed or weight). We introduce a concept of entropy to measure vehicle team diversity. Each spatial task (e.g., delivering take-out or carrying freight) is under time and budget constraints and requires a set of skills, and we need to assure that the assigned vehicle team is diverse. To address this issue, we first formulate a practical problem, called the team diversity spatial crowdsourcing (TD-SC) problem, which finds an optimal team-and-task assignment strategy. Moreover, we design a framework which includes a greedy with diversity (GD) algorithm and a divide-and-conquer (D&C) algorithm to obtain team-and-task assignments. Finally, we demonstrate the efficiency and effectiveness of the proposed methods through extensive experiments.
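The entropy measure mentioned above is presumably Shannon entropy over the team's attribute distribution; a minimal sketch under that assumption (attribute values are made up):

```python
import math
from collections import Counter

def team_entropy(properties):
    """Shannon entropy of the distribution of a property (e.g., vehicle
    size class) across a team; higher entropy means a more diverse team."""
    counts = Counter(properties)
    n = len(properties)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

print(team_entropy(["small", "small", "large", "medium"]))  # 1.5
print(team_entropy(["small"] * 4))                          # 0.0
```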
28

Butyaev, Alexander, Chrisostomos Drogaris, Olivier Tremblay-Savard, and Jérôme Waldispühl. "Human-supervised clustering of multidimensional data using crowdsourcing." Royal Society Open Science 9, no. 5 (May 2022). http://dx.doi.org/10.1098/rsos.211189.

Abstract:
Clustering is a central task in many data analysis applications. However, there is no universally accepted metric to decide the occurrence of clusters. Ultimately, we have to resort to a consensus between experts. The problem is amplified with high-dimensional datasets where classical distances become uninformative and the ability of humans to fully apprehend the distribution of the data is challenged. In this paper, we design a mobile human-computing game as a tool to query human perception for the multidimensional data clustering problem. We propose two clustering algorithms that partially or entirely rely on aggregated human answers and report the results of two experiments conducted on synthetic and real-world datasets. We show that our methods perform on par or better than the most popular automated clustering algorithms. Our results suggest that hybrid systems leveraging annotations of partial datasets collected through crowdsourcing platforms can be an efficient strategy to capture the collective wisdom for solving abstract computational problems.
29

Moradi, Mohammad, and Mohammad Reza Keyvanpour. "CAPTCHA for crowdsourced image annotation: directions and efficiency analysis." Aslib Journal of Information Management, January 4, 2022. http://dx.doi.org/10.1108/ajim-08-2021-0215.

Abstract:
Purpose: Image annotation plays an important role in the image retrieval process, especially when it comes to content-based image retrieval. To compensate for the intrinsic weakness of machines in performing the cognitive task of (human-like) image annotation, leveraging humans' knowledge and abilities in the form of crowdsourcing-based annotation has gained momentum. Among various approaches for this purpose, an innovative one is integrating the annotation process into the CAPTCHA workflow. In this paper, the current state of research in the field and an experimental efficiency analysis of this approach are investigated.
Design/methodology/approach: First, with the aim of presenting a report on the current state of research in the field, a comprehensive literature review is provided. Then, several experiments and statistical analyses are conducted to investigate how reliable, accurate and efficient CAPTCHA-based image annotation is.
Findings: In addition to studying current trends and best practices for CAPTCHA-based image annotation, the experimental results demonstrated that despite some intrinsic limitations of leveraging CAPTCHA as a crowdsourcing platform, when the challenge, i.e. the annotation task, is selected and designed appropriately, the efficiency of CAPTCHA-based image annotation can outperform traditional approaches. Nonetheless, there are several design considerations that should be taken into account when CAPTCHA is used as an image annotation platform.
Originality/value: To the best of the authors' knowledge, this is the first study to analyze different aspects of the titular topic through exploration of the literature and experimental investigation. It is therefore anticipated that the outcomes of this study can draw a roadmap for not only CAPTCHA-based image annotation but also CAPTCHA-mediated crowdsourcing and even image annotation in general.
30

Yasmin, Romena, Md Mahmudulla Hassan, Joshua T. Grassel, Harika Bhogaraju, Adolfo R. Escobedo, and Olac Fuentes. "Improving Crowdsourcing-Based Image Classification Through Expanded Input Elicitation and Machine Learning." Frontiers in Artificial Intelligence 5 (June 29, 2022). http://dx.doi.org/10.3389/frai.2022.848056.

Abstract:
This work investigates how different forms of input elicitation obtained from crowdsourcing can be utilized to improve the quality of inferred labels for image classification tasks, where an image must be labeled as either positive or negative depending on the presence/absence of a specified object. Five types of input elicitation methods are tested: binary classification (positive or negative); the (x, y)-coordinate of the position participants believe a target object is located; level of confidence in binary response (on a scale from 0 to 100%); what participants believe the majority of the other participants' binary classification is; and participant's perceived difficulty level of the task (on a discrete scale). We design two crowdsourcing studies to test the performance of a variety of input elicitation methods and utilize data from over 300 participants. Various existing voting and machine learning (ML) methods are applied to make the best use of these inputs. In an effort to assess their performance on classification tasks of varying difficulty, a systematic synthetic image generation process is developed. Each generated image combines items from the MPEG-7 Core Experiment CE-Shape-1 Test Set into a single image using multiple parameters (e.g., density, transparency, etc.) and may or may not contain a target object. The difficulty of these images is validated by the performance of an automated image classification method. Experiment results suggest that more accurate results can be achieved with smaller training datasets when both the crowdsourced binary classification labels and the average of the self-reported confidence values in these labels are used as features for the ML classifiers. Moreover, when a relatively larger properly annotated dataset is available, in some cases augmenting these ML algorithms with the results (i.e., probability of outcome) from an automated classifier can achieve even higher performance than what can be obtained by using any one of the individual classifiers. Lastly, supplementary analysis of the collected data demonstrates that other performance metrics of interest, namely reduced false-negative rates, can be prioritized through special modifications of the proposed aggregation methods.
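A small synthetic sketch of the winning feature combination reported above: the mean crowd binary vote plus the mean self-reported confidence, fed as features to an ML aggregator (the data generation and model choice are illustrative, not the study's setup):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
truth = rng.integers(0, 2, size=300)                        # true image labels
vote_mean = np.clip(truth + rng.normal(0, 0.3, 300), 0, 1)  # mean crowd vote
conf_mean = np.clip(2 * np.abs(vote_mean - 0.5)             # confidence grows
                    + rng.normal(0, 0.1, 300), 0, 1)        # with decisiveness

X = np.column_stack([vote_mean, conf_mean])
clf = RandomForestClassifier(random_state=0).fit(X[:200], truth[:200])
print(clf.score(X[200:], truth[200:]))                      # held-out accuracy
```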
31

Ahmed, Faez, John Dickerson and Mark Fuge. « Forming Diverse Teams From Sequentially Arriving People ». Journal of Mechanical Design 142, no. 11 (22 May 2020). http://dx.doi.org/10.1115/1.4046998.

Full text
Abstract:
Collaborative work often benefits from having teams or organizations with heterogeneous members. In this paper, we present a method to form such diverse teams from people arriving sequentially over time. We define a monotone submodular objective function that combines the diversity and quality of a team and propose an algorithm to maximize this objective while satisfying multiple constraints. This allows us to balance how diverse the team is against how well it can perform the task at hand. Using crowd experiments, we show that, in practice, the algorithm leads to large gains in team diversity. Using simulations, we show how to quantify the additional cost of forming diverse teams and how to address the problem of simultaneously maximizing diversity for several attributes (e.g., country of origin and gender). Our method has applications in collaborative work ranging from team formation and the assignment of workers to teams in crowdsourcing to the allocation of reviewers to journal papers arriving sequentially. Our code is publicly accessible for further research.
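A greedy scheme is one standard way to maximize a monotone submodular objective when people arrive online, and it gives a feel for the kind of method this abstract describes. The sketch below is illustrative only: the objective (modular quality plus coverage-style diversity), the weight lam, and a simple team-size cap standing in for the paper's multiple constraints are all assumptions.

```python
# Minimal sketch: assign each arriving person to the team with the largest
# marginal gain of a submodular objective f(team) = quality + diversity.
def objective(team, lam=1.0):
    quality = sum(p["skill"] for p in team)           # modular quality term
    diversity = len({p["country"] for p in team})     # coverage: submodular
    return quality + lam * diversity

def assign(person, teams, cap=3):
    """Greedy online step: place `person` where the marginal gain is largest."""
    best, best_gain = None, float("-inf")
    for t in teams:
        if len(t) >= cap:                             # team-size constraint
            continue
        gain = objective(t + [person]) - objective(t)
        if gain > best_gain:
            best, best_gain = t, gain
    if best is not None:
        best.append(person)

teams = [[], []]
stream = [{"skill": 3, "country": "IN"}, {"skill": 2, "country": "US"},
          {"skill": 5, "country": "IN"}, {"skill": 1, "country": "BR"}]
for person in stream:
    assign(person, teams)
print(teams)
```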
32

Yan, Chengxi, Xuemei Tang, Hao Yang and Jun Wang. « A deep active learning-based and crowdsourcing-assisted solution for named entity recognition in Chinese historical corpora ». Aslib Journal of Information Management, 13 December 2022. http://dx.doi.org/10.1108/ajim-03-2022-0107.

Full text
Abstract:
Purpose
The majority of existing studies on named entity recognition (NER) concentrate on enhancing the predictions of deep neural network (DNN)-based models themselves, but the issues of scarce training corpora and of annotation quality control are not fully solved, especially for ancient Chinese corpora. Therefore, designing a new integrated solution for Chinese historical NER, including automatic entity extraction and man-machine cooperative annotation, is quite valuable for improving the effectiveness of Chinese historical NER and fostering the development of low-resource information extraction.

Design/methodology/approach
The research provides a systematic approach for Chinese historical NER with a three-stage framework. In addition to a basic preprocessing stage, the authors create, retrain and yield a high-performance NER model using only limited labeled resources during the stage of augmented deep active learning (ADAL), which entails three steps: DNN-based NER modeling, hybrid pool-based sampling (HPS) based on active learning (AL), and NER-oriented data augmentation (DA). ADAL is designed to keep the performance of the DNN as high as possible under the few-shot constraint. Then, to realize machine-aided quality control in crowdsourcing settings, the authors design a stage of globally-optimized automatic label consolidation (GALC). The core of GALC is a newly designed label consolidation model called simulated annealing-based automatic label aggregation ("SA-ALC"), which incorporates the factors of worker reliability and global label estimation. The model can assure the annotation quality of data from a crowdsourcing annotation system.

Findings
Extensive experiments on two types of Chinese classical historical datasets show that the authors' solution can effectively reduce the corpus dependency of a DNN-based NER model and alleviate the problem of label quality. Moreover, the results show the superior performance of the authors' pipeline approaches (i.e. HPS + DA and SA-ALC) compared to equivalent baselines in each stage.

Originality/value
The study sheds new light on the automatic extraction of Chinese historical entities as an integration of the whole technological process. The solution helps to effectively reduce the annotation cost and control the labeling quality for the NER task, and it can be further applied to similar information extraction tasks and other low-resource fields, in both theoretical and practical ways.
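To give a sense of what a simulated annealing-based label aggregator can look like, here is a minimal sketch: it searches over candidate "true" label assignments, scoring each by reliability-weighted agreement with the worker votes. The scoring function and the fixed per-worker reliabilities are simplifying assumptions (the paper's SA-ALC also performs global label estimation), so treat this as the general technique rather than the authors' model.

```python
# Minimal sketch of simulated annealing for crowd label aggregation.
import math
import random

def score(truth, votes, reliability):
    """votes[i] = list of (worker_id, label) for item i."""
    s = 0.0
    for i, item_votes in enumerate(votes):
        for w, lab in item_votes:
            s += reliability[w] if lab == truth[i] else -reliability[w]
    return s

def anneal(votes, workers, labels=(0, 1), steps=5000, t0=1.0):
    reliability = {w: 1.0 for w in workers}   # fixed here; estimated in practice
    truth = [random.choice(labels) for _ in votes]
    best, best_s = truth[:], score(truth, votes, reliability)
    cur_s = best_s
    for k in range(steps):
        t = t0 * (1 - k / steps) + 1e-6       # cooling schedule
        i = random.randrange(len(truth))
        old = truth[i]
        truth[i] = random.choice([l for l in labels if l != old])
        new_s = score(truth, votes, reliability)
        # Accept improvements always; accept worse moves with prob. exp(d/t).
        if new_s >= cur_s or random.random() < math.exp((new_s - cur_s) / t):
            cur_s = new_s
            if cur_s > best_s:
                best, best_s = truth[:], cur_s
        else:
            truth[i] = old                    # revert the rejected move
    return best

votes = [[("a", 1), ("b", 1), ("c", 0)], [("a", 0), ("b", 0), ("c", 0)]]
print(anneal(votes, workers=["a", "b", "c"]))
```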
33

Mohan, Anuraj, Karthika P.V., Parvathi Sankar, Maya Manohar K. and Amala Peter. « Improving anti-money laundering in bitcoin using evolving graph convolutions and deep neural decision forest ». Data Technologies and Applications, 9 November 2022, 1–17. http://dx.doi.org/10.1108/dta-06-2021-0167.

Full text
Abstract:
Purpose
Money laundering is the process of concealing unlawfully obtained funds by presenting them as coming from a legitimate source. Criminals use crypto money laundering to hide the illicit origin of funds using a variety of methods. The most simplified form of bitcoin money laundering leans hard on the fact that transactions made in cryptocurrencies are pseudonymous, but open data gives more power to investigators and enables the crowdsourcing of forensic analysis. To curb these illegal activities, there exist various rules, policies and technologies collectively known as anti-money laundering (AML) tools. When properly implemented, AML restrictions reduce the negative effects of illegal economic activity while also promoting financial market integrity and stability, but they impose high costs on institutions. The purpose of this work is to highlight the opportunity to reconcile the cause of safety with that of financial inclusion, bearing in mind the limitations of the available data. The authors use the Elliptic dataset, which, to the best of the authors' knowledge, is the largest labelled transaction dataset publicly available in any cryptocurrency.

Design/methodology/approach
AML in bitcoin can be modelled as a node classification task in dynamic networks. This work introduces the graph convolutional decision forest, which combines the strengths of the evolving graph convolutional network and the deep neural decision forest (DNDF). The model is used to classify the unknown transactions in the Elliptic dataset. Additionally, applying knowledge distillation (KD) on top of the proposed approach gives the best results among all the experimented techniques.

Findings
This work demonstrates the importance of combining dynamic graph learning with ensemble feature learning. The results show the superiority of the proposed model in classifying the illicit transactions in the Elliptic dataset. Experiments also show that the results can be further improved when the system is fine-tuned using a KD framework.

Originality/value
Existing works used either ensemble learning or dynamic graph learning to tackle the problem of AML in bitcoin. The proposed model provides a novel way to combine the power of random forests with dynamic graph learning methods. Furthermore, the work also demonstrates the advantage of KD in improving the performance of the whole system.
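The knowledge distillation step mentioned above typically combines a soft-target term against the teacher's tempered outputs with ordinary cross-entropy against the ground-truth labels. The sketch below assumes this standard (Hinton-style) formulation; the temperature T and mixing weight alpha are illustrative choices, not values taken from the paper.

```python
# Minimal sketch of a standard knowledge-distillation loss in PyTorch.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft-target term: match the teacher's tempered class distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-target term: ordinary cross-entropy on the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy usage: 4 transactions, 2 classes (licit / illicit).
student_logits = torch.randn(4, 2, requires_grad=True)
teacher_logits = torch.randn(4, 2)
labels = torch.tensor([0, 1, 1, 0])
print(kd_loss(student_logits, teacher_logits, labels))
```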
34

McQuillan, Dan. « The Countercultural Potential of Citizen Science ». M/C Journal 17, no. 6 (12 October 2014). http://dx.doi.org/10.5204/mcj.919.

Full text
Abstract:
What is the countercultural potential of citizen science? As a participant in the wider citizen science movement, I can attest that contemporary citizen science initiatives rarely characterise themselves as countercultural. Rather, the goal of most citizen science projects is to be seen as producing orthodox scientific knowledge: the ethos is respectability rather than rebellion (NERC). I will suggest instead that there are resonances with the counterculture that emerged in the 1960s, most visibly through an emphasis on participatory experimentation and the principles of environmental sustainability and social justice. This will be illustrated by example, through two citizen science projects that have a commitment to combining social values with scientific practice. I will then describe the explicitly countercultural organisation, Science for the People, which arose from within the scientific community itself, out of opposition to the Vietnam War. Methodological and conceptual weaknesses in the authoritative model of science are explored, suggesting that there is an opportunity for citizen science to become anti-hegemonic by challenging the hegemony of science itself. This reformulation will be expressed through Deleuze and Guattari's notion of nomadic science, the means through which citizen science could become countercultural.

Counterculture

Before examining the countercultural potential of citizen science, I set out some of the grounds for identifying a counterculture, drawing on the ideas of Theodore Roszak, who invented the term counterculture to describe the new forms of youth movements that emerged in the 1960s (Roszak). This was a perspective that allowed the carnivalesque procession of beatniks, hippies and the New Left to be seen as a single paradigm shift combining psychic and social revolution. But just as striking, and more often forgotten, is the way Roszak characterised the role of the counterculture as mobilising a vital critique of the scientific worldview (Roszak 273-274). The concept of counterculture has been taken up in diverse ways since its original formation. We can draw, for example, on Lawrence Grossberg's more contemporary analysis of counterculture (Grossberg) to clarify the main concepts and contrast them with a scientific approach. Firstly, a counterculture works on and through cultural formations. This positions it as something the scientific community would see as the other, as the opposite to the objective, repeatable and quantitative truth-seeking of science. Secondly, a counterculture is a diverse and hybrid space without a unitary identity. Again, scientists would often see science as a singular activity applied in modulated forms depending on the context, although in practice the different sciences can experience each other as different tribes. Thirdly, a counterculture is lived as a transformative experience where the participant is fundamentally changed at a psychic level through participation in unique events. Contrast this with the scientific idea of the separation of observer and observed, and the objective repeatability of the experiment irrespective of the experimenter. Fourthly, a counterculture is associated with a unique moment in time, a point of shift from the old to the new. For the counterculture of the 1960s this was the Age of Aquarius.
In general, the aim of science and scientists is to contribute to a form of truth that is essentially timeless, in that a physical law is assumed to hold across all time (and space), although science also has moments of radical change with regard to scientific paradigms. Finally, and significantly for the conclusions of this paper, according to Roszak a counterculture stands against the mainstream. It offers a challenge not at the level of detail but to the fundamental assumptions of the status quo. This is what "science" cannot do, in as much as science itself has become the mainstream. It was the character of science as the bedrock of all values that Roszak himself opposed and for which he named and welcomed the counterculture. Although critical of some of the more shallow aspects of its psychedelic experimentation or political militancy, he shared its criticism of the technocratic society (the technocracy) and the egocentric mode of consciousness. His hope was that the counterculture could help restore a visionary imagination along with a more human sense of community.

What Is Citizen Science?

In recent years the concept of citizen science has grown massively in popularity, but it is still an open and unstable term with many variants. Current moves towards institutionalisation (Citizen Science Association) are attempting to marry growth and stabilisation, with the first Annual General Meeting of the European Citizen Science Association securing a tentative agreement on the common principles of citizen science (Haklay, "European"). Key papers and presentations in the mainstream of the movement emphasise that citizen science is not a new activity (Bonney et al.), with much being made of the fact that the National Audubon Society started its annual Christmas Bird Count in 1900 (National Audubon Society). However, this elides the key role of the Internet in the current surge, which takes two distinct forms: the organisation of distributed fieldwork, and the online crowdsourcing of data analysis. To scientists, the appeal of citizen science fieldwork follows from its distributed character; they can research patterns over large scales and across latitudes in ways that would be impossible for a researcher at a single study site (Toomey). Gathering together the volunteers' observations is made possible by an infrastructure of web tools. The role of the citizen in this is to be a careful observer; the eyes and ears of the scientist in cyberspace. In online crowdsourcing, the Internet is used to present pattern recognition tasks, enrolling users in searching images for signs of new planets or the jets of material from black holes. The growth of science crowdsourcing is exponential; one of the largest sites facilitating this kind of citizen science now has well in excess of a million registered users (Zooniverse). Such is the force of the technological aura around crowdsourced science that mainstream publications often conflate it with the whole of citizen science (Parr). There are projects within citizen science which share core values with the counterculture as originally defined by Roszak, in particular open participation and social justice. These projects also show characteristics from Grossberg's analysis of counterculture; they are diverse and hybrid spaces, carry a sense of moving from an old era to a new one, and have cultural forms of their own. They open up the full range of the scientific method to participation, including problem definition, research design, analysis and action.
Citizen science projects that aim for participation in all these areas include the Extreme Citizen Science research group (ExCiteS) at University College London (UCL), the associated social enterprise Mapping for Change (Mapping for Change), and the Public Laboratory for Open Technology and Science (Public Lab). ExCiteS sees its version of citizen science as "a situated, bottom-up practice" that "takes into account local needs, practices and culture". Public Lab, meanwhile, argue that many citizen science projects only offer non-scientists token forms of participation in scientific inquiry that rarely amount to more than data collection and record keeping. They counter this through an open process which tries to involve communities all the way from framing the research questions, to prototyping tools, to collating and interpreting the measurements. ExCiteS and Public Lab also share an implicit commitment to social justice through scientific activity. The Public Lab mission is to "put scientific inquiry at the heart of civic life" and the UCL research group strives for "new devices and knowledge creation processes that can transform the world". All of their work is framed by environmental sustainability and care for the planet, whether it's enabling environmental monitoring by indigenous communities in the Congo (ExCiteS) or developing do-it-yourself spectrometry kits to detect crude oil pollution (Public Lab, "Homebrew"). Having provided a case for elements of countercultural DNA being present in bottom-up and problem-driven citizen science, we can contrast this with Science for the People, a scientific movement that was born out of the counterculture.

Countercultural Science from the 1970s: Science for the People

Science for the People (SftP) was a scientific movement seeded by a rebellion of young physicists against the role of US science in the Vietnam War. Young members of the American Physical Society (APS) lobbied for it to take a position against the war but were heavily criticised by other members, whose written complaints in the communications of the APS focused on the importance of scientific neutrality and the need to maintain the association's purely scientific nature rather than allowing science to become contaminated by politics (Sarah Bridger, in Plenary 2, 0:46 to 1:04). The counter-narrative from the dissidents argued that science is not neutral, invoking the example of Nazi science as a justification for taking a stand. After losing the internal vote the young radicals left to form Scientists and Engineers for Social and Political Action (SESPA), which later became Science for the People (SftP). As well as opposition to the Vietnam War, SftP embodied from the start other key themes of the counterculture, such as civil rights and feminism. For example, the first edition of Science for the People magazine (appearing as Vol. 2, No. 2 of the SESPA Newsletter) included an article about leading Black Panther, Bobby Seale, alongside a piece entitled "Women Demand Equality in Science." The final articles in the same issue are indicators of SftP's dual approach to science and change: both the radicalisation of professionals ("Computer Professionals for Peace") and the demystification of technical practices ("Statistics for the People") (Science for the People). Science for the People was by no means just a magazine.
For example, their technical assistance programme provided practical support to street health clinics run by the Black Panthers, and brought SftP under FBI surveillance (Herb Fox, in Plenary 1, 0:25 to 0:35). Both as a magazine and as a movement, SftP showed a tenacious longevity, with the publication being produced every two months between August 1970 and May/June 1989. It mutated through a network of affiliated local groups and international links, and was deeply involved in constructing early critiques of nuclear power and genetic determinism. SftP itself seems to have had a consistent commitment to non-hierarchical processes and, as one of the founders expressed it, a "shit kicking" approach to putting its principles into practice (Al Weinrub, in Plenary 1, 0:25 to 0:35). SftP criticised power, front and centre. It is this opposition to hegemony that puts the "counter" into counterculture, and it is missing from citizen science as currently practised. Cracks in the authority of orthodox science, which can be traced to both methodologies and basic concepts, follow in this paper. These can be seen as an opportunity for citizen science to directly challenge orthodox science and thus establish an anti-hegemonic stance of its own.

Weaknesses of Scientific Hegemony

In this section I argue that the weaknesses of scientific hegemony are in proportion to its claims to authority (Feyerabend). Through my scientific training as an experimental particle physicist I have participated in many discussions about the ontological and epistemological grounds for scientific authority. While most scientists choose to present their practice publicly as an infallible machine for the production of truths, the opinions behind the curtain are far more mixed. Physicist Lee Smolin has written a devastating critique of science-in-practice that focuses on the capture of the institutional economy of science by an ideological grouping of string theorists (Smolin), and his account is replete with questions about science itself and ethnographic details that bring to life the messy behind-the-scenes conflicts in scientific knowledge-making. Knowledge of this messiness has prompted some citizen science advocates to take science to task, for example for demanding higher standards of data consistency from citizen science than is often the case in orthodox science (Haklay, "Assertions"; Freitag, "Good Science"). Scientists will also invariably refer to reproducibility as the basis for the authority of scientific truths. The principle that the same experiments always get the same results, irrespective of who is doing the experiment, as long as they follow the same method, is a foundation of scientific objectivity. However, a 2012 study of landmark results in cancer science was able to reproduce only 11 per cent of the original findings (Begley and Ellis). While this may be an outlier case, there are broader issues with statistics and falsification, a bias towards positive results, weaknesses in peer review and the "publish or perish" academic culture (The Economist). While the pressures are all too human, the resulting distortions are rarely acknowledged in public by scientists themselves. On the other hand, citizen science has been slow to pick up the gauntlet.
For example, while some scientists involved in citizen science have commented on the inequality and inappropriateness of orthodox peer review for citizen science papers (Freitag, "What Is the Role"), there has been no direct challenge to any significant part of the scientific edifice. I argue that the nearest thing to a real challenge to orthodox science is the proposal for a post-normal science, which pre-dates the current wave of citizen science. Post-normal science tries to accommodate the philosophical implications of post-structuralism and at the same time position science to tackle problems such as climate change, which are intractable to reproducibility (Funtowicz and Ravetz). It accomplishes this by extending the domains in which science can provide meaningful answers to include issues such as global warming, which involve high decision stakes and high uncertainty. It extends traditional peer review into an extended peer community, which includes all the stakeholders in an issue and may involve active research as well as quality assessment. The idea of extended peer review has obvious overlaps with community-oriented citizen science, but has yet to be widely mobilised as a theoretical buttress for citizen-led science. Prior even to post-normal science are the potential cracks in the core philosophy of science. In her book Cosmopolitics, Isabelle Stengers characterises the essential nature of scientific truth as the ability to disqualify and exclude other truth claims. This, she asserts, is the hegemony of physics and its singular claim to decide what is real and what is true. Stengers traces this, in part, to the confrontation more than one hundred years ago between Max Planck and Ernst Mach, in which the latter argued that claims to an absolute truth should be replaced by formulations that tied physical laws to the human practices that produced them. Planck stood firmly for knowledge forms that were unbounded by time, space or specific social-material procedures (Stengers). Although contemporary understandings of science are based on Planck's version, citizen science has the potential to re-open these questions in a productive manner for its own practices, if it can re-conceive of itself as what Deleuze and Guattari would call nomadic science (Deleuze; Deleuze and Guattari).

Citizen Science as Nomadic Science

Deleuze and Guattari referred to orthodox science as Royal Science or Striated Science, referring in part to its state-like form of authority and practice, as well as its psycho-social character. Their alternative is a smooth or nomadic science that, importantly for citizen science, does not have the ambition to totalise knowledge. Nomadic science is a form of empirical investigation that has no need to be hooked up to a grand narrative. The concept of nomadic science is a natural fit for bottom-up citizen science because it can valorise truths that are non-dual and that go beyond objectivity to include the experiential. In this sense it is like the extended peer review of post-normal science but without the need to be limited to high-risk, high-stakes questions. As there is no a priori problem with provisional knowledges, it naturally inclines towards the local, the situated and the culturally reflective.
The apparent unreliability of citizen science in terms of participants and tools, which is solely a source of anxiety, can become heuristic for nomadic science when re-cast through forgotten alternatives like Mach's formulation: that truths are never separated from the specifics of the context and process that produced them (Stengers 6-18; 223). Nomadic science, I believe, will start to emerge through projects that are prepared to tackle toxic epistemology as much as toxic pollutants. For example, the Community Based Auditing (CBA) developed by environmental activists in Tasmania (Tattersall) challenges local alliances of state and extractive industries by undermining their own truth claims with regard to environmental impact, a process described in the CBA Toolbox as disconfirmation. In CBA, this mixture of post-normal science and Stengers's critique is combined with forms of data collection and analysis known as Community Based Sampling (Tattersall et al.), which would be recognisable to any citizen science project. The change from citizen science to nomadic science is not a total rupture but a shift in the starting point: it is based on an overt critique of power. One way to bring this about is being tested in the "Kosovo Science for Change" project (Science for Change Kosovo), where I am a researcher and where we have adopted the critical pedagogy of Paulo Freire as the starting point for our empirical investigations (Freire). Critical pedagogy is learning as the co-operative activity of understanding: how our lived experience is constructed by power, and how to make a difference in the world. Taking a position such as nomadic science, openly critical of Royal Science, is the anti-hegemonic stance that could qualify citizen science as properly countercultural.

Citizen Science and Counterculture

Counterculture, as I have expressed it, stands against or rejects the hegemonic culture. However, there is a strong tendency in contemporary social movements to take a stance not only against the dominant structures but against hegemony itself. They contest what Richard Day calls the hegemony of hegemony (Day). I witnessed this during the counter-G8 mobilisation of 2001. Having been an activist in the 1980s and 1990s, I was wearily familiar with the sectarian competitiveness of various radical narratives, each seeking to establish itself as the correct path. So it was a strongly affective experience to stand in the convergence centre and listen to so many divergent social groups and movements agree to support each other's tactics, expressing a solidarity based on a non-judgemental pluralism. Since then we have seen the emergence of similarly anti-hegemonic countercultures around the Occupy and Anonymous movements. It is in this context of counterculture that I will try to summarise and evaluate the countercultural potential of citizen science and what being countercultural might offer to citizen science itself. To be countercultural it is not enough for citizen science to counterpose participation against the institutional and hierarchical aspects of professional science. As an activity defined purely by engagement it offers to plug the legitimacy gap for science while still being wholly dependent on it. A countercultural citizen science must pose a strong challenge to the status quo, and I have suggested that a route to this would be to develop as nomadic science. This does not mean replacing or overthrowing science but constructing an other to science with its own claim to empirical methods.
It is fair to ask what this would offer citizen science that it does not already have. At an abstract level it would gain a freedom of movement; an ability to occupy Deleuzian smooth spaces rather than be constrained by the striation of established science. The founders of Science for the People are clear that it could never have existed if it had not been able to draw on the mass movements of its time. Being countercultural would give citizen science an affinity with the bottom-up, local and community-based issues where empirical methods are likely to have the most social impact. One of many examples is the movement against fracking (the hydraulic fracturing of deep rock formations to release shale gas). Together, these benefits of being countercultural open up the possibility for forms of citizen science to spread rhizomatically, in a way that is not about immaterial virtual labour but is itself part of a wider cultural change. The possibility of a nomadic science stands as a doorway to the change that Roszak saw at the heart of the counterculture, a renewal of the visionary imagination.

References

Begley, C. Glenn, and Lee M. Ellis. "Drug Development: Raise Standards for Preclinical Cancer Research." Nature 483.7391 (2012): 531–533. 8 Oct. 2014 ‹http://www.nature.com/nature/journal/v483/n7391/full/483531a.html›.
Bonney, Rick, et al. "Citizen Science: A Developing Tool for Expanding Science Knowledge and Scientific Literacy." BioScience 59.11 (2009): 977–984. 6 Oct. 2014 ‹http://bioscience.oxfordjournals.org/content/59/11/977›.
Citizen Science Association. "Citizen Science Association." 2014. 6 Oct. 2014 ‹http://citizenscienceassociation.org/›.
Day, Richard J.F. Gramsci Is Dead: Anarchist Currents in the Newest Social Movements. London: Pluto Press, 2005.
Deleuze, Gilles. Nomadology: The War Machine. New York, NY: MIT Press, 1986.
Deleuze, Gilles, and Felix Guattari. A Thousand Plateaus. London: Bloomsbury Academic, 2013.
ExCiteS. "From Non-Literate Data Collection to Intelligent Maps." 26 Aug. 2013. 8 Oct. 2014 ‹http://www.ucl.ac.uk/excites/projects/excites-projects/intelligent-maps/intelligent-maps›.
Feyerabend, Paul K. Against Method. 4th ed. London: Verso, 2010.
Freire, Paulo. Pedagogy of the Oppressed. Continuum International Publishing Group, 2000.
Freitag, Amy. "Good Science and Bad Science in Democratized Science." Oceanspaces 22 Jan. 2014. 9 Oct. 2014 ‹http://oceanspaces.org/blog/good-science-and-bad-science-democratized-science›.
---. "What Is the Role of Peer-Reviewed Literature in Citizen Science?" Oceanspaces 29 Jan. 2014. 10 Oct. 2014 ‹http://oceanspaces.org/blog/what-role-peer-reviewed-literature-citizen-science›.
Funtowicz, Silvio O., and Jerome R. Ravetz. "Science for the Post-Normal Age." Futures 25.7 (1993): 739–755. 8 Oct. 2014 ‹http://www.sciencedirect.com/science/article/pii/001632879390022L›.
Grossberg, Lawrence. "Some Preliminary Conjunctural Thoughts on Countercultures." Journal of Gender and Power 1.1 (2014). 3 Nov. 2014 ‹http://gender-power.amu.edu.pl/?page_id=20›.
Haklay, Muki. "Assertions on Crowdsourced Geographic Information & Citizen Science #2." Po Ve Sham - Muki Haklay's Personal Blog 16 Jan. 2014. 8 Oct. 2014 ‹http://povesham.wordpress.com/2014/01/16/assertions-on-crowdsourced-geographic-information-citizen-science-2/›.
---. "European Citizen Science Association Suggestion for 10 Principles of Citizen Science." Po Ve Sham - Muki Haklay's Personal Blog 14 May 2014. 6 Oct. 2014 ‹http://povesham.wordpress.com/2014/05/14/european-citizen-science-association-suggestion-for-10-principles-of-citizen-science/›.
Mapping for Change. "Mapping for Change." 2014. 6 June 2014 ‹http://www.mappingforchange.org.uk/›.
National Audubon Society. "Christmas Bird Count." 2014. 6 Oct. 2014 ‹http://birds.audubon.org/christmas-bird-count›.
NERC. "Best Practice Guides to Choosing and Using Citizen Science for Environmental Projects." Centre for Ecology & Hydrology May 2014. 9 Oct. 2014 ‹http://www.ceh.ac.uk/products/publications/understanding-citizen-science.html›.
Parr, Chris. "Why Citizen Scientists Help and How to Keep Them Hooked." Times Higher Education 6 June 2013. 6 Oct. 2014 ‹http://www.timeshighereducation.co.uk/news/why-citizen-scientists-help-and-how-to-keep-them-hooked/2004321.article›.
Plenary 1: Stories from the Movement. Film. Science for the People, 2014.
Plenary 2: The History and Lasting Significance of Science for the People. Film. Science for the People, 2014.
Public Lab. "Public Lab: A DIY Environmental Science Community." 2014. 6 June 2014 ‹http://publiclab.org/›.
---. "The Homebrew Oil Testing Kit." Kickstarter 24 Sep. 2014. 8 Oct. 2014 ‹https://www.kickstarter.com/projects/publiclab/the-homebrew-oil-testing-kit›.
Roszak, Theodore. The Making of a Counter Culture. Garden City, N.Y.: Anchor Books/Doubleday, 1969.
Science for Change Kosovo. "Citizen Science Kosovo." Facebook, n.d. 17 Aug. 2014 ‹https://www.facebook.com/CitSciKS›.
Science for the People. "SftP Magazine." 2013. 8 Oct. 2014 ‹http://science-for-the-people.org/sftp-resources/magazine/›.
Smolin, Lee. The Trouble with Physics: The Rise of String Theory, the Fall of a Science, and What Comes Next. Reprint ed. Boston: Mariner Books, 2007.
Stengers, Isabelle. Cosmopolitics I. Trans. Robert Bononno. Minneapolis: U of Minnesota P, 2010.
Tattersall, Philip J. "What Is Community Based Auditing and How Does It Work?" Futures 42.5 (2010): 466–474. 9 Oct. 2014 ‹http://www.sciencedirect.com/science/article/pii/S0016328709002055›.
---, Kim Eastman, and Tasmanian Community Resource Auditors. Community Based Auditing: Tool Boxes: Training and Support Guides. Beauty Point, Tas.: Resource Publications, 2010.
The Economist. "Trouble at the Lab." 19 Oct. 2013. 8 Oct. 2014 ‹http://www.economist.com/news/briefing/21588057-scientists-think-science-self-correcting-alarming-degree-it-not-trouble›.
Toomey, Diane. "How Rise of Citizen Science Is Democratizing Research." 28 Jan. 2014. 6 Oct. 2014 ‹http://e360.yale.edu/feature/interview_caren_cooper_how_rise_of_citizen_science_is_democratizing_research/2733/›.
UCL. "Extreme Citizen Science (ExCiteS)." July 2013. 6 June 2014 ‹http://www.ucl.ac.uk/excites/›.
Zooniverse. "The Ever-Expanding Zooniverse - Updated." Daily Zooniverse 3 Feb. 2014. 6 Oct. 2014 ‹http://daily.zooniverse.org/2014/02/03/the-ever-expanding-zooniverse-updated/›.