How to start and stop a worker thread

Jan 03 2021

I have the following requirement, which is standard in other programming languages, but I don't know how to do it in Rust.

I have a class, and I want to write a method that spawns a worker thread satisfying 2 conditions:

After spawning the worker thread, the function returns (so the caller does not have to wait)
There is a mechanism to stop this thread.

For example, here is my dummy code:

struct A {
    thread: JoinHandle<?>,
}

impl A {
    pub fn run(&mut self) -> Result<()>{
        self.thread = thread::spawn(move || {
            let mut i = 0;
            loop {
                self.call();
                i = 1 + i;
                if i > 5 {
                    return
                }
            }
        });
        Ok(())
    }

    pub fn stop(&mut self) -> std::thread::Result<_> {
        self.thread.join()
    }

    pub fn call(&mut self) {
        println!("hello world");
    }
}

fn main() {
    let mut a = A{};
    a.run();
}

I get an error at thread: JoinHandle<?>. What is the type of thread in this case? And is my code correct for starting and stopping a worker thread?

Answers

3 vallentin Jan 03 2021 at 02:07

In short, the T that join() on a JoinHandle<T> returns is the result of the closure passed to thread::spawn(). So in your case JoinHandle<?> should be JoinHandle<()>, since your closure returns nothing, i.e. () (unit).
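To make the link between the closure's return value and the T in JoinHandle<T> concrete, here is a minimal sketch that uses only std::thread. The helper names spawn_unit and spawn_number are made up for illustration and are not part of the question's code.

```rust
use std::thread::{self, JoinHandle};

// Hypothetical helper: the closure returns nothing, so the handle is `JoinHandle<()>`.
fn spawn_unit() -> JoinHandle<()> {
    thread::spawn(|| {
        println!("hello world");
    })
}

// Hypothetical helper: the closure returns a `u32`, so the handle is `JoinHandle<u32>`.
fn spawn_number() -> JoinHandle<u32> {
    thread::spawn(|| 6 * 7)
}

fn main() {
    // `join()` yields `thread::Result<T>`; it is `Err` only if the thread panicked.
    let unit: () = spawn_unit().join().unwrap();
    let number: u32 = spawn_number().join().unwrap();
    println!("{:?} {}", unit, number);
}
```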

Besides that, your dummy code contains a few additional issues.

The return type of run() is incorrect, and should at least be Result<(), ()>.
The thread field would need to be Option<JoinHandle<()>> to be able to handle fn stop(&mut self), as join() consumes the JoinHandle.
However, you're attempting to pass &mut self to the closure, which brings a lot more issues, all boiling down to multiple mutable references.
- This could be solved with e.g. Mutex<A>. However, if you then call stop(), that could lead to a deadlock instead.

However, since this was dummy code and you clarified things in the comments, let me try to clarify what you meant with a few examples. This includes rewriting your dummy code.

Result after the worker is done

If you don't need access to the data while the worker thread is running, you can create a new struct WorkerData. Then in run() you copy/clone the data you need from A (or, as I've renamed it, Worker). Finally, the closure returns data again, so you can acquire it through join().

use std::thread::{self, JoinHandle};

struct WorkerData {
    // ...
}

impl WorkerData {
    pub fn call(&mut self) {
        println!("hello world");
    }
}

struct Worker {
    thread: Option<JoinHandle<WorkerData>>,
}

impl Worker {
    pub fn new() -> Self {
        Self { thread: None }
    }

    pub fn run(&mut self) {
        // Create `WorkerData` and copy/clone whatever is needed from `self`
        let mut data = WorkerData {};

        self.thread = Some(thread::spawn(move || {
            let mut i = 0;
            loop {
                data.call();
                i = 1 + i;
                if i > 5 {
                    // Return `data` so we get it back through `join()`
                    return data;
                }
            }
        }));
    }

    pub fn stop(&mut self) -> Option<thread::Result<WorkerData>> {
        if let Some(handle) = self.thread.take() {
            Some(handle.join())
        } else {
            None
        }
    }
}

You don't really need thread to be Option<JoinHandle<WorkerData>> and could instead just use JoinHandle<WorkerData>. If you wanted to call run() again, it would be easier to simply reassign the variable holding the Worker.

So now we can simplify Worker by removing the Option, changing stop to consume the Worker (and its thread) instead, and creating new() -> Self in place of run(&mut self).

use std::thread::{self, JoinHandle};

struct Worker {
    thread: JoinHandle<WorkerData>,
}

impl Worker {
    pub fn new() -> Self {
        // Create `WorkerData` with whatever state the worker needs
        let mut data = WorkerData {};

        let thread = thread::spawn(move || {
            let mut i = 0;
            loop {
                data.call();
                i = 1 + i;
                if i > 5 {
                    return data;
                }
            }
        });

        Self { thread }
    }

    pub fn stop(self) -> thread::Result<WorkerData> {
        self.thread.join()
    }
}
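As a self-contained sketch of how the simplified Worker might be used: the calls counter inside WorkerData is an assumption added here so the result is observable, since the original leaves the struct's fields unspecified.

```rust
use std::thread::{self, JoinHandle};

// Assumption: `WorkerData` holds a simple counter; the original leaves its fields unspecified.
struct WorkerData {
    calls: u32,
}

impl WorkerData {
    fn call(&mut self) {
        self.calls += 1;
    }
}

struct Worker {
    thread: JoinHandle<WorkerData>,
}

impl Worker {
    fn new() -> Self {
        let mut data = WorkerData { calls: 0 };
        let thread = thread::spawn(move || {
            // Same number of iterations as the `i > 5` loop above.
            for _ in 0..6 {
                data.call();
            }
            // Hand the data back to whoever joins the thread.
            data
        });
        Self { thread }
    }

    fn stop(self) -> thread::Result<WorkerData> {
        self.thread.join()
    }
}

fn main() {
    // `new()` returns immediately; the loop runs on the spawned thread.
    let worker = Worker::new();
    let data = worker.stop().expect("worker thread panicked");
    println!("call() ran {} times", data.calls);
}
```

Note that stop() here only waits for the loop to finish on its own; it does not interrupt it.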

Shared `WorkerData`

If you want to keep references to WorkerData across multiple threads, you need to use Arc. Since you also want to be able to mutate it, you'll need a Mutex.

If you only mutate within a single thread, you could alternatively use a RwLock, which, compared to a Mutex, lets you lock and obtain multiple immutable references at the same time.

use std::sync::{Arc, RwLock};
use std::thread::{self, JoinHandle};

struct Worker {
    thread: JoinHandle<()>,
    data: Arc<RwLock<WorkerData>>,
}

impl Worker {
    pub fn new() -> Self {
        // Create `WorkerData` with whatever state the worker needs
        let data = Arc::new(RwLock::new(WorkerData {}));

        let thread = thread::spawn({
            let data = data.clone();
            move || {
                let mut i = 0;
                loop {
                    if let Ok(mut data) = data.write() {
                        data.call();
                    }

                    i = 1 + i;
                    if i > 5 {
                        return;
                    }
                }
            }
        });

        Self { thread, data }
    }

    pub fn stop(self) -> thread::Result<Arc<RwLock<WorkerData>>> {
        self.thread.join()?;
        // You might be able to unwrap and get the inner `WorkerData` here
        Ok(self.data)
    }
}

Suppose you add a method that returns data in the form of Arc<RwLock<WorkerData>>. If you then clone the Arc and lock it (the inner RwLock) before calling stop(), the result would be a deadlock. To avoid that, any data() method should return &WorkerData or &mut WorkerData instead of the Arc. That way you cannot call stop() while the data is borrowed, so you cannot cause the deadlock.
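With a RwLock in between, a bare reference is not directly available, so one way to get the same effect as the advice above is to return a lock guard that borrows from &self. The sketch below assumes a calls counter in WorkerData and a data() accessor, neither of which is in the original. The borrow checker then rejects any call to stop(self) while the guard is still alive.

```rust
use std::sync::{Arc, RwLock, RwLockReadGuard};
use std::thread::{self, JoinHandle};

// Assumption: a counter field, purely for illustration.
struct WorkerData {
    calls: u32,
}

struct Worker {
    thread: JoinHandle<()>,
    data: Arc<RwLock<WorkerData>>,
}

impl Worker {
    fn new() -> Self {
        let data = Arc::new(RwLock::new(WorkerData { calls: 0 }));
        let thread = thread::spawn({
            let data = Arc::clone(&data);
            move || {
                for _ in 0..6 {
                    if let Ok(mut data) = data.write() {
                        data.calls += 1;
                    }
                }
            }
        });
        Self { thread, data }
    }

    // The guard borrows `self`, so `stop(self)` cannot be called while it is alive.
    fn data(&self) -> RwLockReadGuard<'_, WorkerData> {
        self.data.read().expect("lock poisoned")
    }

    fn stop(self) -> thread::Result<Arc<RwLock<WorkerData>>> {
        self.thread.join()?;
        Ok(self.data)
    }
}

fn main() {
    let worker = Worker::new();
    {
        let snapshot = worker.data();
        println!("calls so far: {}", snapshot.calls);
    } // Guard dropped here; only now can `worker` be moved into `stop()`.
    let data = worker.stop().expect("worker thread panicked");
    println!("final calls: {}", data.read().unwrap().calls);
}
```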

Flag to stop the worker

If you actually want to stop the worker thread, you need a flag that signals the thread to do so. You can create such a flag in the form of a shared AtomicBool.

use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, RwLock};
use std::thread::{self, JoinHandle};

struct Worker {
    thread: JoinHandle<()>,
    data: Arc<RwLock<WorkerData>>,
    stop_flag: Arc<AtomicBool>,
}

impl Worker {
    pub fn new() -> Self {
        // Create `WorkerData` with whatever state the worker needs
        let data = Arc::new(RwLock::new(WorkerData {}));

        let stop_flag = Arc::new(AtomicBool::new(false));

        let thread = thread::spawn({
            let data = data.clone();
            let stop_flag = stop_flag.clone();
            move || {
                // let mut i = 0;
                loop {
                    if stop_flag.load(Ordering::Relaxed) {
                        break;
                    }

                    if let Ok(mut data) = data.write() {
                        data.call();
                    }

                    // i = 1 + i;
                    // if i > 5 {
                    //     return;
                    // }
                }
            }
        });

        Self {
            thread,
            data,
            stop_flag,
        }
    }

    pub fn stop(self) -> thread::Result<Arc<RwLock<WorkerData>>> {
        self.stop_flag.store(true, Ordering::Relaxed);
        self.thread.join()?;
        // You might be able to unwrap and get the inner `WorkerData` here
        Ok(self.data)
    }
}
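To watch the flag in action, the following stripped-down sketch replaces WorkerData with an iteration counter (an assumption for illustration, as is the 1 ms sleep). The main thread waits until the worker has made some progress, then calls stop(), which sets the flag and joins.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

struct Worker {
    thread: JoinHandle<()>,
    stop_flag: Arc<AtomicBool>,
    // Assumption: an iteration counter stands in for real work.
    iterations: Arc<AtomicUsize>,
}

impl Worker {
    fn new() -> Self {
        let stop_flag = Arc::new(AtomicBool::new(false));
        let iterations = Arc::new(AtomicUsize::new(0));
        let thread = thread::spawn({
            let stop_flag = Arc::clone(&stop_flag);
            let iterations = Arc::clone(&iterations);
            move || {
                while !stop_flag.load(Ordering::Relaxed) {
                    iterations.fetch_add(1, Ordering::Relaxed);
                    // Sleep briefly so the loop does not spin at full speed.
                    thread::sleep(Duration::from_millis(1));
                }
            }
        });
        Self { thread, stop_flag, iterations }
    }

    fn iterations(&self) -> usize {
        self.iterations.load(Ordering::Relaxed)
    }

    // Signals the loop to exit, waits for the thread, and returns the final count.
    fn stop(self) -> thread::Result<usize> {
        self.stop_flag.store(true, Ordering::Relaxed);
        self.thread.join()?;
        Ok(self.iterations.load(Ordering::Relaxed))
    }
}

fn main() {
    let worker = Worker::new();
    // Wait until the worker has demonstrably made some progress.
    while worker.iterations() < 3 {
        thread::sleep(Duration::from_millis(1));
    }
    let total = worker.stop().expect("worker thread panicked");
    println!("worker ran {} iterations before stopping", total);
}
```

The worker only notices the flag at the top of each iteration, so stop() can take as long as one iteration of the loop body.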

Multiple threads and multiple tasks

If you want multiple kinds of tasks processed, spread across multiple threads, here's a more generalized example.

You already mentioned using mpsc. So you can use a Sender and Receiver along with a custom Task and TaskResult enum.

use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{self, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

pub enum Task {
    ...
}

pub enum TaskResult {
    ...
}

pub type TaskSender = Sender<Task>;
pub type TaskReceiver = Receiver<Task>;

pub type ResultSender = Sender<TaskResult>;
pub type ResultReceiver = Receiver<TaskResult>;

struct Worker {
    threads: Vec<JoinHandle<()>>,
    task_sender: TaskSender,
    result_receiver: ResultReceiver,
    stop_flag: Arc<AtomicBool>,
}

impl Worker {
    pub fn new(num_threads: usize) -> Self {
        let (task_sender, task_receiver) = mpsc::channel();
        let (result_sender, result_receiver) = mpsc::channel();

        let task_receiver = Arc::new(Mutex::new(task_receiver));

        let stop_flag = Arc::new(AtomicBool::new(false));

        Self {
            threads: (0..num_threads)
                .map(|_| {
                    let task_receiver = task_receiver.clone();
                    let result_sender = result_sender.clone();
                    let stop_flag = stop_flag.clone();

                    thread::spawn(move || loop {
                        if stop_flag.load(Ordering::Relaxed) {
                            break;
                        }

                        let task_receiver = task_receiver.lock().unwrap();

                        if let Ok(task) = task_receiver.recv() {
                            drop(task_receiver);

                            // Perform the `task` here

                            // If the `Task` results in a `TaskResult` then create it and send it back
                            let result: TaskResult = ...;
                            // The `SendError` can be ignored as it only occurs if the receiver
                            // has already been deallocated
                            let _ = result_sender.send(result);
                        } else {
                            break;
                        }
                    })
                })
                .collect(),
            task_sender,
            result_receiver,
            stop_flag,
        }
    }

    pub fn stop(self) -> Vec<thread::Result<()>> {
        drop(self.task_sender);

        self.stop_flag.store(true, Ordering::Relaxed);

        self.threads
            .into_iter()
            .map(|t| t.join())
            .collect::<Vec<_>>()
    }

    #[inline]
    pub fn request(&mut self, task: Task) {
        self.task_sender.send(task).unwrap();
    }

    #[inline]
    pub fn result_receiver(&mut self) -> &ResultReceiver {
        &self.result_receiver
    }
}

An example of using Worker, sending tasks and receiving task results, would then look like this:

fn main() {
    let mut worker = Worker::new(4);

    // Request that a `Task` is performed
    worker.request(task);

    // Receive a `TaskResult` if any are pending
    if let Ok(result) = worker.result_receiver().try_recv() {
        // Process the `TaskResult`
    }
}
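Since the listing above elides the Task and TaskResult variants, here is a compact runnable variation under stated assumptions: Task::Square and TaskResult::Squared are invented for illustration. It also drops the stop flag and relies solely on closing the task channel, so every queued task is processed before the threads exit. That is a deliberate simplification, not the behavior of the code above.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

// Assumption: illustrative variants; the original leaves both enums unspecified.
enum Task {
    Square(u64),
}

#[derive(Debug, PartialEq)]
enum TaskResult {
    Squared(u64),
}

struct Worker {
    threads: Vec<JoinHandle<()>>,
    task_sender: Sender<Task>,
    result_receiver: Receiver<TaskResult>,
}

impl Worker {
    fn new(num_threads: usize) -> Self {
        let (task_sender, task_receiver) = mpsc::channel::<Task>();
        let (result_sender, result_receiver) = mpsc::channel();
        let task_receiver = Arc::new(Mutex::new(task_receiver));

        let threads: Vec<JoinHandle<()>> = (0..num_threads)
            .map(|_| {
                let task_receiver = Arc::clone(&task_receiver);
                let result_sender = result_sender.clone();
                thread::spawn(move || loop {
                    // The lock guard is a temporary, released at the end of this statement.
                    let task = task_receiver.lock().unwrap().recv();
                    match task {
                        Ok(Task::Square(n)) => {
                            let _ = result_sender.send(TaskResult::Squared(n * n));
                        }
                        // All senders are gone: no more tasks will arrive.
                        Err(_) => break,
                    }
                })
            })
            .collect();

        Self { threads, task_sender, result_receiver }
    }

    fn request(&self, task: Task) {
        self.task_sender.send(task).unwrap();
    }

    // Closes the task channel, joins every thread, and drains the results.
    fn stop(self) -> Vec<TaskResult> {
        drop(self.task_sender);
        for t in self.threads {
            t.join().expect("worker thread panicked");
        }
        self.result_receiver.try_iter().collect()
    }
}

fn main() {
    let worker = Worker::new(4);
    for n in 1..=3 {
        worker.request(Task::Square(n));
    }
    let mut results = worker.stop();
    // Results arrive in completion order, so sort before printing.
    results.sort_by_key(|r| match r {
        TaskResult::Squared(v) => *v,
    });
    println!("{:?}", results);
}
```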

In a few cases you might need to implement Send for Task and/or TaskResult. Check out "Understanding the Send trait".

unsafe impl Send for Task {}
unsafe impl Send for TaskResult {}
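Before reaching for unsafe impl Send, it may be worth checking whether the compiler already treats the types as Send, since an enum whose fields are all Send gets the trait automatically. A common idiom is a generic function with a Send bound. The assert_send helper and the enum variants below are hypothetical, for illustration only.

```rust
// Assumption: example variants; plain owned data like this is `Send` automatically.
#[allow(dead_code)]
enum Task {
    Print(String),
}

#[allow(dead_code)]
enum TaskResult {
    Printed(usize),
}

// Hypothetical helper: compiles only when `T` implements `Send`.
fn assert_send<T: Send>() {}

fn main() {
    assert_send::<Task>();
    assert_send::<TaskResult>();
    // A type containing `std::rc::Rc<_>` would fail this check at compile time.
    println!("Task and TaskResult are Send");
}
```

If the check fails, the compiler error names the field that is not Send, which helps decide whether an unsafe impl is really justified.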

1 ddulaney Jan 03 2021 at 00:45

The type parameter of a JoinHandle should be the return type of the thread's function.

In this case, the return type is an empty tuple (), pronounced unit. It is used when there is only one possible value, and it is the implicit "return type" of functions when no return type is specified.

You can just write JoinHandle<()> to represent that the function returns nothing.

(Note: your code will run into some borrow checker issues with self.call(), which will probably need to be solved with Arc<Mutex<Self>>, but that's another question.)
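As a rough sketch of what the Arc<Mutex<...>> approach mentioned in the note could look like: the calls counter and the free function run are assumptions added for illustration, and the loop length mirrors the question's i > 5 check.

```rust
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

struct A {
    // Assumption: a counter so the effect of `call()` can be observed.
    calls: u32,
}

impl A {
    fn call(&mut self) {
        self.calls += 1;
        println!("hello world");
    }
}

// Hypothetical free function: spawns a thread that calls `call()` on the shared `A`.
fn run(shared: &Arc<Mutex<A>>) -> JoinHandle<()> {
    let shared = Arc::clone(shared);
    thread::spawn(move || {
        for _ in 0..6 {
            // Lock only for the duration of one call so other threads can interleave.
            shared.lock().unwrap().call();
        }
    })
}

fn main() {
    let a = Arc::new(Mutex::new(A { calls: 0 }));
    let handle = run(&a);
    handle.join().expect("worker thread panicked");
    println!("calls: {}", a.lock().unwrap().calls);
}
```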