Liste as mudanças inesperadamente após a atribuição. Como faço para clonar ou copiar para evitar isso?

Apr 10 2010

Durante o uso new_list = my_list, qualquer modificação nas new_listalterações my_listsempre. Por que isso ocorre e como posso clonar ou copiar a lista para evitá-lo?

Respostas

3501 FelixKling Apr 10 2010 at 15:55

Com new_list = my_list, você não tem realmente duas listas. A atribuição apenas copia a referência para a lista, não a lista real, portanto, ambos new_liste my_listreferem-se à mesma lista após a atribuição.

Para realmente copiar a lista, você tem várias possibilidades:

  • Você pode usar o list.copy()método integrado (disponível desde Python 3.3):

    new_list = old_list.copy()
    
  • Você pode cortá-lo:

    new_list = old_list[:]
    

    A opinião de Alex Martelli (pelo menos em 2007 ) sobre isso é que é uma sintaxe estranha e não faz sentido usá-la nunca . ;) (Em sua opinião, o próximo é mais legível).

  • Você pode usar a list()função integrada :

    new_list = list(old_list)
    
  • Você pode usar genéricos copy.copy():

    import copy
    new_list = copy.copy(old_list)
    

    Isso é um pouco mais lento do que list()porque primeiro tem que descobrir o tipo de dados old_list.

  • Se a lista contiver objetos e você também quiser copiá-los, use o genérico copy.deepcopy():

    import copy
    new_list = copy.deepcopy(old_list)
    

    Obviamente, o método mais lento e que mais precisa de memória, mas às vezes inevitável.

Exemplo:

import copy

class Foo(object):
    def __init__(self, val):
         self.val = val

    def __repr__(self):
        return 'Foo({!r})'.format(self.val)

foo = Foo(1)

a = ['foo', foo]
b = a.copy()
c = a[:]
d = list(a)
e = copy.copy(a)
f = copy.deepcopy(a)

# edit orignal list and instance 
a.append('baz')
foo.val = 5

print('original: %r\nlist.copy(): %r\nslice: %r\nlist(): %r\ncopy: %r\ndeepcopy: %r'
      % (a, b, c, d, e, f))

Resultado:

original: ['foo', Foo(5), 'baz']
list.copy(): ['foo', Foo(5)]
slice: ['foo', Foo(5)]
list(): ['foo', Foo(5)]
copy: ['foo', Foo(5)]
deepcopy: ['foo', Foo(1)]
644 cryo Apr 10 2010 at 17:16

Felix já deu uma resposta excelente, mas pensei em fazer uma comparação de velocidade dos vários métodos:

  1. 10,59 s (105,9us / itn) - copy.deepcopy(old_list)
  2. 10,16 seg (101,6us / itn) - Copy()classes de cópia do método Python puro com deepcopy
  3. 1,488 seg (14,88us / itn) - Copy()método python puro não copiando classes (apenas dicts / lists / tuples)
  4. 0,325 seg (3,25us / itn) - for item in old_list: new_list.append(item)
  5. 0,217 seg (2,17us / itn) - [i for i in old_list](uma compreensão de lista )
  6. 0,186 s (1,86us / itn) - copy.copy(old_list)
  7. 0,075 s (0,75us / itn) - list(old_list)
  8. 0,053 s (0,53us / itn) - new_list = []; new_list.extend(old_list)
  9. 0,039 seg (0,39us / itn) - old_list[:]( divisão de lista )

Portanto, o mais rápido é o fatiamento de listas. Mas esteja ciente de que copy.copy(), list[:]e list(list), ao contrário copy.deepcopy()da versão python, não copie nenhuma lista, dicionário e instância de classe na lista, portanto, se os originais forem alterados, eles também serão alterados na lista copiada e vice-versa.

(Aqui está o roteiro, se alguém estiver interessado ou quiser levantar alguma questão :)

from copy import deepcopy

class old_class:
    def __init__(self):
        self.blah = 'blah'

class new_class(object):
    def __init__(self):
        self.blah = 'blah'

dignore = {str: None, unicode: None, int: None, type(None): None}

def Copy(obj, use_deepcopy=True):
    t = type(obj)

    if t in (list, tuple):
        if t == tuple:
            # Convert to a list if a tuple to 
            # allow assigning to when copying
            is_tuple = True
            obj = list(obj)
        else: 
            # Otherwise just do a quick slice copy
            obj = obj[:]
            is_tuple = False

        # Copy each item recursively
        for x in xrange(len(obj)):
            if type(obj[x]) in dignore:
                continue
            obj[x] = Copy(obj[x], use_deepcopy)

        if is_tuple: 
            # Convert back into a tuple again
            obj = tuple(obj)

    elif t == dict: 
        # Use the fast shallow dict copy() method and copy any 
        # values which aren't immutable (like lists, dicts etc)
        obj = obj.copy()
        for k in obj:
            if type(obj[k]) in dignore:
                continue
            obj[k] = Copy(obj[k], use_deepcopy)

    elif t in dignore: 
        # Numeric or string/unicode? 
        # It's immutable, so ignore it!
        pass 

    elif use_deepcopy: 
        obj = deepcopy(obj)
    return obj

if __name__ == '__main__':
    import copy
    from time import time

    num_times = 100000
    L = [None, 'blah', 1, 543.4532, 
         ['foo'], ('bar',), {'blah': 'blah'},
         old_class(), new_class()]

    t = time()
    for i in xrange(num_times):
        Copy(L)
    print 'Custom Copy:', time()-t

    t = time()
    for i in xrange(num_times):
        Copy(L, use_deepcopy=False)
    print 'Custom Copy Only Copying Lists/Tuples/Dicts (no classes):', time()-t

    t = time()
    for i in xrange(num_times):
        copy.copy(L)
    print 'copy.copy:', time()-t

    t = time()
    for i in xrange(num_times):
        copy.deepcopy(L)
    print 'copy.deepcopy:', time()-t

    t = time()
    for i in xrange(num_times):
        L[:]
    print 'list slicing [:]:', time()-t

    t = time()
    for i in xrange(num_times):
        list(L)
    print 'list(L):', time()-t

    t = time()
    for i in xrange(num_times):
        [i for i in L]
    print 'list expression(L):', time()-t

    t = time()
    for i in xrange(num_times):
        a = []
        a.extend(L)
    print 'list extend:', time()-t

    t = time()
    for i in xrange(num_times):
        a = []
        for y in L:
            a.append(y)
    print 'list append:', time()-t

    t = time()
    for i in xrange(num_times):
        a = []
        a.extend(i for i in L)
    print 'generator expression extend:', time()-t
160 anatolytechtonik Jul 23 2013 at 19:32

Já foi dito que o Python 3.3+ adicionalist.copy() método, que deve ser tão rápido como corte:

newlist = old_list.copy()

133 AaronHall Oct 25 2014 at 19:13

Quais são as opções para clonar ou copiar uma lista em Python?

No Python 3, uma cópia superficial pode ser feita com:

a_copy = a_list.copy()

No Python 2 e 3, você pode obter uma cópia superficial com uma parte completa do original:

a_copy = a_list[:]

Explicação

Existem duas maneiras semânticas de copiar uma lista. Uma cópia superficial cria uma nova lista dos mesmos objetos, uma cópia profunda cria uma nova lista contendo novos objetos equivalentes.

Cópia da lista rasa

Uma cópia superficial copia apenas a própria lista, que é um contêiner de referências aos objetos na lista. Se os objetos contidos em si forem mutáveis ​​e um for alterado, a alteração será refletida em ambas as listas.

Existem diferentes maneiras de fazer isso no Python 2 e 3. As formas do Python 2 também funcionam no Python 3.

Python 2

No Python 2, a maneira idiomática de fazer uma cópia superficial de uma lista é com uma fatia completa do original:

a_copy = a_list[:]

Você também pode fazer a mesma coisa passando a lista pelo construtor de lista,

a_copy = list(a_list)

mas usar o construtor é menos eficiente:

>>> timeit
>>> l = range(20)
>>> min(timeit.repeat(lambda: l[:]))
0.30504298210144043
>>> min(timeit.repeat(lambda: list(l)))
0.40698814392089844

Python 3

No Python 3, as listas obtêm o list.copymétodo:

a_copy = a_list.copy()

Em Python 3.5:

>>> import timeit
>>> l = list(range(20))
>>> min(timeit.repeat(lambda: l[:]))
0.38448613602668047
>>> min(timeit.repeat(lambda: list(l)))
0.6309100328944623
>>> min(timeit.repeat(lambda: l.copy()))
0.38122922903858125

Fazer outro ponteiro não faz uma cópia

Usando new_list = my_list então modifica new_list toda vez que my_list muda. Por que é isso?

my_listé apenas um nome que aponta para a lista real na memória. Quando você diz new_list = my_listque não está fazendo uma cópia, está apenas adicionando outro nome que aponta para a lista original na memória. Podemos ter problemas semelhantes quando fazemos cópias de listas.

>>> l = [[], [], []]
>>> l_copy = l[:]
>>> l_copy
[[], [], []]
>>> l_copy[0].append('foo')
>>> l_copy
[['foo'], [], []]
>>> l
[['foo'], [], []]

A lista é apenas uma matriz de ponteiros para o conteúdo, portanto, uma cópia superficial apenas copia os ponteiros e, portanto, você tem duas listas diferentes, mas elas têm o mesmo conteúdo. Para fazer cópias do conteúdo, você precisa de uma cópia em profundidade.

Cópias profundas

Para fazer uma cópia detalhada de uma lista, no Python 2 ou 3, use deepcopyno copymódulo :

import copy
a_deep_copy = copy.deepcopy(a_list)

Para demonstrar como isso nos permite fazer novas sublistas:

>>> import copy
>>> l
[['foo'], [], []]
>>> l_deep_copy = copy.deepcopy(l)
>>> l_deep_copy[0].pop()
'foo'
>>> l_deep_copy
[[], [], []]
>>> l
[['foo'], [], []]

E assim vemos que a lista copiada em profundidade é uma lista totalmente diferente da original. Você poderia rolar sua própria função - mas não o faça. É provável que você crie bugs que, de outra forma, não criaria usando a função deepcopy da biblioteca padrão.

Não use eval

Você pode ver isso usado como uma forma de deepcopy, mas não faça isso:

problematic_deep_copy = eval(repr(a_list))
  1. É perigoso, principalmente se você estiver avaliando algo de uma fonte em que não confia.
  2. It's not reliable, if a subelement you're copying doesn't have a representation that can be eval'd to reproduce an equivalent element.
  3. It's also less performant.

In 64 bit Python 2.7:

>>> import timeit
>>> import copy
>>> l = range(10)
>>> min(timeit.repeat(lambda: copy.deepcopy(l)))
27.55826997756958
>>> min(timeit.repeat(lambda: eval(repr(l))))
29.04534101486206

on 64 bit Python 3.5:

>>> import timeit
>>> import copy
>>> l = list(range(10))
>>> min(timeit.repeat(lambda: copy.deepcopy(l)))
16.84255409205798
>>> min(timeit.repeat(lambda: eval(repr(l))))
34.813894678023644
57 jack Nov 23 2014 at 23:45

There are many answers already that tell you how to make a proper copy, but none of them say why your original 'copy' failed.

Python doesn't store values in variables; it binds names to objects. Your original assignment took the object referred to by my_list and bound it to new_list as well. No matter which name you use there is still only one list, so changes made when referring to it as my_list will persist when referring to it as new_list. Each of the other answers to this question give you different ways of creating a new object to bind to new_list.

Each element of a list acts like a name, in that each element binds non-exclusively to an object. A shallow copy creates a new list whose elements bind to the same objects as before.

new_list = list(my_list)  # or my_list[:], but I prefer this syntax
# is simply a shorter way of:
new_list = [element for element in my_list]

To take your list copy one step further, copy each object that your list refers to, and bind those element copies to a new list.

import copy  
# each element must have __copy__ defined for this...
new_list = [copy.copy(element) for element in my_list]

This is not yet a deep copy, because each element of a list may refer to other objects, just like the list is bound to its elements. To recursively copy every element in the list, and then each other object referred to by each element, and so on: perform a deep copy.

import copy
# each element must have __deepcopy__ defined for this...
new_list = copy.deepcopy(my_list)

See the documentation for more information about corner cases in copying.

46 AadityaUra Nov 13 2017 at 14:04

Let's start from the beginning and explore this question.

So let's suppose you have two lists:

list_1=['01','98']
list_2=[['01','98']]

And we have to copy both lists, now starting from the first list:

So first let's try by setting the variable copy to our original list, list_1:

copy=list_1

Now if you are thinking copy copied the list_1, then you are wrong. The id function can show us if two variables can point to the same object. Let's try this:

print(id(copy))
print(id(list_1))

The output is:

4329485320
4329485320

Both variables are the exact same argument. Are you surprised?

So as we know python doesn't store anything in a variable, Variables are just referencing to the object and object store the value. Here object is a list but we created two references to that same object by two different variable names. This means that both variables are pointing to the same object, just with different names.

When you do copy=list_1, it is actually doing:

Here in the image list_1 and copy are two variable names but the object is same for both variable which is list

So if you try to modify copied list then it will modify the original list too because the list is only one there, you will modify that list no matter you do from the copied list or from the original list:

copy[0]="modify"

print(copy)
print(list_1)

output:

['modify', '98']
['modify', '98']

So it modified the original list :

Now let's move onto a pythonic method for copying lists.

copy_1=list_1[:]

This method fixes the first issue we had:

print(id(copy_1))
print(id(list_1))

4338792136
4338791432

So as we can see our both list having different id and it means that both variables are pointing to different objects. So what actually going on here is:

Now let's try to modify the list and let's see if we still face the previous problem:

copy_1[0]="modify"

print(list_1)
print(copy_1)

The output is:

['01', '98']
['modify', '98']

As you can see, it only modified the copied list. That means it worked.

Do you think we're done? No. Let's try to copy our nested list.

copy_2=list_2[:]

list_2 should reference to another object which is copy of list_2. Let's check:

print(id((list_2)),id(copy_2))

We get the output:

4330403592 4330403528

Now we can assume both lists are pointing different object, so now let's try to modify it and let's see it is giving what we want:

copy_2[0][1]="modify"

print(list_2,copy_2)

This gives us the output:

[['01', 'modify']] [['01', 'modify']]

This may seem a little bit confusing, because the same method we previously used worked. Let's try to understand this.

When you do:

copy_2=list_2[:]

You're only copying the outer list, not the inside list. We can use the id function once again to check this.

print(id(copy_2[0]))
print(id(list_2[0]))

The output is:

4329485832
4329485832

When we do copy_2=list_2[:], this happens:

It creates the copy of list but only outer list copy, not the nested list copy, nested list is same for both variable, so if you try to modify the nested list then it will modify the original list too as the nested list object is same for both lists.

What is the solution? The solution is the deepcopy function.

from copy import deepcopy
deep=deepcopy(list_2)

Let's check this:

print(id((list_2)),id(deep))

4322146056 4322148040

Both outer lists have different IDs, let's try this on the inner nested lists.

print(id(deep[0]))
print(id(list_2[0]))

The output is:

4322145992
4322145800

As you can see both IDs are different, meaning we can assume that both nested lists are pointing different object now.

This means when you do deep=deepcopy(list_2) what actually happens:

Both nested lists are pointing different object and they have separate copy of nested list now.

Now let's try to modify the nested list and see if it solved the previous issue or not:

deep[0][1]="modify"
print(list_2,deep)

It outputs:

[['01', '98']] [['01', 'modify']]

As you can see, it didn't modify the original nested list, it only modified the copied list.

38 PaulTarjan Apr 10 2010 at 15:53

Use thing[:]

>>> a = [1,2]
>>> b = a[:]
>>> a += [3]
>>> a
[1, 2, 3]
>>> b
[1, 2]
>>> 
37 River Apr 05 2017 at 08:01

Python 3.6 Timings

Here are the timing results using Python 3.6.8. Keep in mind these times are relative to one another, not absolute.

I stuck to only doing shallow copies, and also added some new methods that weren't possible in Python2, such as list.copy() (the Python3 slice equivalent) and two forms of list unpacking (*new_list, = list and new_list = [*list]):

METHOD                  TIME TAKEN
b = [*a]                2.75180600000021
b = a * 1               3.50215399999990
b = a[:]                3.78278899999986  # Python2 winner (see above)
b = a.copy()            4.20556500000020  # Python3 "slice equivalent" (see above)
b = []; b.extend(a)     4.68069800000012
b = a[0:len(a)]         6.84498999999959
*b, = a                 7.54031799999984
b = list(a)             7.75815899999997
b = [i for i in a]      18.4886440000000
b = copy.copy(a)        18.8254879999999
b = []
for item in a:
  b.append(item)        35.4729199999997

We can see the Python2 winner still does well, but doesn't edge out Python3 list.copy() by much, especially considering the superior readability of the latter.

The dark horse is the unpacking and repacking method (b = [*a]), which is ~25% faster than raw slicing, and more than twice as fast as the other unpacking method (*b, = a).

b = a * 1 also does surprisingly well.

Note that these methods do not output equivalent results for any input other than lists. They all work for sliceable objects, a few work for any iterable, but only copy.copy() works for more general Python objects.


Here is the testing code for interested parties (Template from here):

import timeit

COUNT = 50000000
print("Array duplicating. Tests run", COUNT, "times")
setup = 'a = [0,1,2,3,4,5,6,7,8,9]; import copy'

print("b = list(a)\t\t", timeit.timeit(stmt='b = list(a)', setup=setup, number=COUNT))
print("b = copy.copy(a)\t", timeit.timeit(stmt='b = copy.copy(a)', setup=setup, number=COUNT))
print("b = a.copy()\t\t", timeit.timeit(stmt='b = a.copy()', setup=setup, number=COUNT))
print("b = a[:]\t\t", timeit.timeit(stmt='b = a[:]', setup=setup, number=COUNT))
print("b = a[0:len(a)]\t\t", timeit.timeit(stmt='b = a[0:len(a)]', setup=setup, number=COUNT))
print("*b, = a\t\t\t", timeit.timeit(stmt='*b, = a', setup=setup, number=COUNT))
print("b = []; b.extend(a)\t", timeit.timeit(stmt='b = []; b.extend(a)', setup=setup, number=COUNT))
print("b = []; for item in a: b.append(item)\t", timeit.timeit(stmt='b = []\nfor item in a:  b.append(item)', setup=setup, number=COUNT))
print("b = [i for i in a]\t", timeit.timeit(stmt='b = [i for i in a]', setup=setup, number=COUNT))
print("b = [*a]\t\t", timeit.timeit(stmt='b = [*a]', setup=setup, number=COUNT))
print("b = a * 1\t\t", timeit.timeit(stmt='b = a * 1', setup=setup, number=COUNT))
34 erisco Apr 10 2010 at 15:53

Python's idiom for doing this is newList = oldList[:]

20 AMR Jul 10 2015 at 10:51

All of the other contributors gave great answers, which work when you have a single dimension (leveled) list, however of the methods mentioned so far, only copy.deepcopy() works to clone/copy a list and not have it point to the nested list objects when you are working with multidimensional, nested lists (list of lists). While Felix Kling refers to it in his answer, there is a little bit more to the issue and possibly a workaround using built-ins that might prove a faster alternative to deepcopy.

While new_list = old_list[:], copy.copy(old_list)' and for Py3k old_list.copy() work for single-leveled lists, they revert to pointing at the list objects nested within the old_list and the new_list, and changes to one of the list objects are perpetuated in the other.

Edit: New information brought to light

As was pointed out by both Aaron Hall and PM 2Ring using eval() is not only a bad idea, it is also much slower than copy.deepcopy().

This means that for multidimensional lists, the only option is copy.deepcopy(). With that being said, it really isn't an option as the performance goes way south when you try to use it on a moderately sized multidimensional array. I tried to timeit using a 42x42 array, not unheard of or even that large for bioinformatics applications, and I gave up on waiting for a response and just started typing my edit to this post.

It would seem that the only real option then is to initialize multiple lists and work on them independently. If anyone has any other suggestions, for how to handle multidimensional list copying, it would be appreciated.

As others have stated, there are significant performance issues using the copy module and copy.deepcopy for multidimensional lists.

14 SCB Feb 26 2018 at 09:33

It surprises me that this hasn't been mentioned yet, so for the sake of completeness...

You can perform list unpacking with the "splat operator": *, which will also copy elements of your list.

old_list = [1, 2, 3]

new_list = [*old_list]

new_list.append(4)
old_list == [1, 2, 3]
new_list == [1, 2, 3, 4]

The obvious downside to this method is that it is only available in Python 3.5+.

Timing wise though, this appears to perform better than other common methods.

x = [random.random() for _ in range(1000)]

%timeit a = list(x)
%timeit a = x.copy()
%timeit a = x[:]

%timeit a = [*x]

#: 2.47 µs ± 38.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
#: 2.47 µs ± 54.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
#: 2.39 µs ± 58.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

#: 2.22 µs ± 43.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
8 jainashish Nov 01 2017 at 15:08

A very simple approach independent of python version was missing in already given answers which you can use most of the time (at least I do):

new_list = my_list * 1       #Solution 1 when you are not using nested lists

However, If my_list contains other containers (for eg. nested lists) you must use deepcopy as others suggested in the answers above from the copy library. For example:

import copy
new_list = copy.deepcopy(my_list)   #Solution 2 when you are using nested lists

.Bonus: If you don't want to copy elements use (aka shallow copy):

new_list = my_list[:]

Let's understand difference between Solution#1 and Solution #2

>>> a = range(5)
>>> b = a*1
>>> a,b
([0, 1, 2, 3, 4], [0, 1, 2, 3, 4])
>>> a[2] = 55 
>>> a,b
([0, 1, 55, 3, 4], [0, 1, 2, 3, 4])

As you can see Solution #1 worked perfectly when we were not using the nested lists. Let's check what will happen when we apply solution #1 to nested lists.

>>> from copy import deepcopy
>>> a = [range(i,i+4) for i in range(3)]
>>> a
[[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5]]
>>> b = a*1
>>> c = deepcopy(a)
>>> for i in (a, b, c): print i   
[[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5]]
[[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5]]
[[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5]]
>>> a[2].append('99')
>>> for i in (a, b, c): print i   
[[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5, 99]]
[[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5, 99]]   #Solution#1 didn't work in nested list
[[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5]]       #Solution #2 - DeepCopy worked in nested list
8 Chris_Rands May 16 2018 at 21:31

Note that there are some cases where if you have defined your own custom class and you want to keep the attributes then you should use copy.copy() or copy.deepcopy() rather than the alternatives, for example in Python 3:

import copy

class MyList(list):
    pass

lst = MyList([1,2,3])

lst.name = 'custom list'

d = {
'original': lst,
'slicecopy' : lst[:],
'lstcopy' : lst.copy(),
'copycopy': copy.copy(lst),
'deepcopy': copy.deepcopy(lst)
}


for k,v in d.items():
    print('lst: {}'.format(k), end=', ')
    try:
        name = v.name
    except AttributeError:
        name = 'NA'
    print('name: {}'.format(name))

Outputs:

lst: original, name: custom list
lst: slicecopy, name: NA
lst: lstcopy, name: NA
lst: copycopy, name: custom list
lst: deepcopy, name: custom list
5 RaviShankar Jun 27 2017 at 04:03
new_list = my_list[:]

new_list = my_list Try to understand this. Let's say that my_list is in the heap memory at location X i.e. my_list is pointing to the X. Now by assigning new_list = my_list you're Letting new_list pointing to the X. This is known as shallow Copy.

Now if you assign new_list = my_list[:] You're simply copying each object of my_list to new_list. This is known as Deep copy.

The Other way you can do this are :

  • new_list = list(old_list)
  • import copy new_list = copy.deepcopy(old_list)
3 Corman Sep 08 2019 at 09:25

I wanted to post something a bit different then some of the other answers. Even though this is most likely not the most understandable, or fastest option, it provides a bit of an inside view of how deep copy works, as well as being another alternative option for deep copying. It doesn't really matter if my function has bugs, since the point of this is to show a way to copy objects like the question answers, but also to use this as a point to explain how deepcopy works at its core.

At the core of any deep copy function is way to make a shallow copy. How? Simple. Any deep copy function only duplicates the containers of immutable objects. When you deepcopy a nested list, you are only duplicating the outer lists, not the mutable objects inside of the lists. You are only duplicating the containers. The same works for classes, too. When you deepcopy a class, you deepcopy all of its mutable attributes. So, how? How come you only have to copy the containers, like lists, dicts, tuples, iters, classes, and class instances?

It's simple. A mutable object can't really be duplicated. It can never be changed, so it is only a single value. That means you never have to duplicate strings, numbers, bools, or any of those. But how would you duplicate the containers? Simple. You make just initialize a new container with all of the values. Deepcopy relies on recursion. It duplicates all the containers, even ones with containers inside of them, until no containers are left. A container is an immutable object.

Once you know that, completely duplicating an object without any references is pretty easy. Here's a function for deepcopying basic data-types (wouldn't work for custom classes but you could always add that)

def deepcopy(x):
  immutables = (str, int, bool, float)
  mutables = (list, dict, tuple)
  if isinstance(x, immutables):
    return x
  elif isinstance(x, mutables):
    if isinstance(x, tuple):
      return tuple(deepcopy(list(x)))
    elif isinstance(x, list):
      return [deepcopy(y) for y in x]
    elif isinstance(x, dict):
      values = [deepcopy(y) for y in list(x.values())]
      keys = list(x.keys())
      return dict(zip(keys, values))

Python's own built-in deepcopy is based around that example. The only difference is it supports other types, and also supports user-classes by duplicating the attributes into a new duplicate class, and also blocks infinite-recursion with a reference to an object it's already seen using a memo list or dictionary. And that's really it for making deep copies. At its core, making a deep copy is just making shallow copies. I hope this answer adds something to the question.

EXAMPLES

Say you have this list: [1, 2, 3]. The immutable numbers cannot be duplicated, but the other layer can. You can duplicate it using a list comprehension: [x for x in [1, 2, 3]

Now, imagine you have this list: [[1, 2], [3, 4], [5, 6]]. This time, you want to make a function, which uses recursion to deep copy all layers of the list. Instead of the previous list comprehension:

[x for x in _list]

It uses a new one for lists:

[deepcopy_list(x) for x in _list]

And deepcopy_list looks like this:

def deepcopy_list(x):
  if isinstance(x, (str, bool, float, int)):
    return x
  else:
    return [deepcopy_list(y) for y in x]

Then now you have a function which can deepcopy any list of strs, bools, floast, ints and even lists to infinitely many layers using recursion. And there you have it, deepcopying.

TLDR: Deepcopy uses recursion to duplicate objects, and merely returns the same immutable objects as before, as immutable objects cannot be duplicated. However, it deepcopies the most inner layers of mutable objects until it reaches the outermost mutable layer of an object.

3 B.Mr.W. Nov 24 2019 at 02:01

A slight practical perspective to look into memory through id and gc.

>>> b = a = ['hell', 'word']
>>> c = ['hell', 'word']

>>> id(a), id(b), id(c)
(4424020872, 4424020872, 4423979272) 
     |           |
      -----------

>>> id(a[0]), id(b[0]), id(c[0])
(4424018328, 4424018328, 4424018328) # all referring to same 'hell'
     |           |           |
      -----------------------

>>> id(a[0][0]), id(b[0][0]), id(c[0][0])
(4422785208, 4422785208, 4422785208) # all referring to same 'h'
     |           |           |
      -----------------------

>>> a[0] += 'o'
>>> a,b,c
(['hello', 'word'], ['hello', 'word'], ['hell', 'word'])  # b changed too
>>> id(a[0]), id(b[0]), id(c[0])
(4424018384, 4424018384, 4424018328) # augmented assignment changed a[0],b[0]
     |           |
      -----------

>>> b = a = ['hell', 'word']
>>> id(a[0]), id(b[0]), id(c[0])
(4424018328, 4424018328, 4424018328) # the same hell
     |           |           |
      -----------------------

>>> import gc
>>> gc.get_referrers(a[0]) 
[['hell', 'word'], ['hell', 'word']]  # one copy belong to a,b, the another for c
>>> gc.get_referrers(('hell'))
[['hell', 'word'], ['hell', 'word'], ('hell', None)] # ('hello', None) 
3 Dr.Hippo Feb 22 2020 at 19:44

Remember that in Python when you do:

    list1 = ['apples','bananas','pineapples']
    list2 = list1

List2 isn't storing the actual list, but a reference to list1. So when you do anything to list1, list2 changes as well. use the copy module (not default, download on pip) to make an original copy of the list(copy.copy() for simple lists, copy.deepcopy() for nested ones). This makes a copy that doesn't change with the first list.

1 RoshinRaphel Jun 04 2020 at 17:40

This is because, the line new_list = my_list assigns a new reference to the variable my_list which is new_list This is similar to the C code given below,

int my_list[] = [1,2,3,4];
int *new_list;
new_list = my_list;

You should use the copy module to create a new list by

import copy
new_list = copy.deepcopy(my_list)
shahar_m Apr 11 2020 at 18:19

The deepcopy option is the only method that works for me:

from copy import deepcopy

a = [   [ list(range(1, 3)) for i in range(3) ]   ]
b = deepcopy(a)
b[0][1]=[3]
print('Deep:')
print(a)
print(b)
print('-----------------------------')
a = [   [ list(range(1, 3)) for i in range(3) ]   ]
b = a*1
b[0][1]=[3]
print('*1:')
print(a)
print(b)
print('-----------------------------')
a = [   [ list(range(1, 3)) for i in range(3) ] ]
b = a[:]
b[0][1]=[3]
print('Vector copy:')
print(a)
print(b)
print('-----------------------------')
a = [   [ list(range(1, 3)) for i in range(3) ]  ]
b = list(a)
b[0][1]=[3]
print('List copy:')
print(a)
print(b)
print('-----------------------------')
a = [   [ list(range(1, 3)) for i in range(3) ]  ]
b = a.copy()
b[0][1]=[3]
print('.copy():')
print(a)
print(b)
print('-----------------------------')
a = [   [ list(range(1, 3)) for i in range(3) ]  ]
b = a
b[0][1]=[3]
print('Shallow:')
print(a)
print(b)
print('-----------------------------')

leads to output of:

Deep:
[[[1, 2], [1, 2], [1, 2]]]
[[[1, 2], [3], [1, 2]]]
-----------------------------
*1:
[[[1, 2], [3], [1, 2]]]
[[[1, 2], [3], [1, 2]]]
-----------------------------
Vector copy:
[[[1, 2], [3], [1, 2]]]
[[[1, 2], [3], [1, 2]]]
-----------------------------
List copy:
[[[1, 2], [3], [1, 2]]]
[[[1, 2], [3], [1, 2]]]
-----------------------------
.copy():
[[[1, 2], [3], [1, 2]]]
[[[1, 2], [3], [1, 2]]]
-----------------------------
Shallow:
[[[1, 2], [3], [1, 2]]]
[[[1, 2], [3], [1, 2]]]
-----------------------------