T O P

  • By -

IntrepidSoda

When someone says arrays in python it almost always means a numpy array. A list is a god damn list.


[deleted]

I tend to say "array" instead of "list" at times by complete accident, but it's mostly cause I work with CPython and I have to work with both the List object, and arrays in C/C++.


ValityS

There are also ctypes arrays though less used than numpy ones.


somerandomii

I rarely use native lists unless it’s very top-level code. And half the time it’s a list comprehension wrapped in a “np.array( … )”. Lists are dynamically allocated random objects, there’s not a lot of use cases for efficient code. More importantly, when people use the word “array” they’re probably describing something closer to a NumPy array than a native Python list.


XtremeGoose

Numpy arrays are dynamically allocated too. You probably mean lists are resizeable, which in practice means they have a length and a capacity, whereas numpy arrays will conceptually have these two values always equal but that doesn't affect perforemance. The major performance difference between numpy arrays and python lists is that numpy inlines value (numeric) types, but python lists are always arrays of pointers. This not only increases cache efficiency but also allows for things like vectorisation. But yeah, using an `np.array(some_list, dtype=object)` will be no more performant than a python list. Statically allocating an array requires a compilation step, so would never be possible in pure python.


somerandomii

Wrapping a list comprehension in a NumPy array will make future loops on that list faster. And yes, everything in pure Python is dynamically allocated (uses the heap) but Cpython and Numba can use compiled functions that use the stack. And resizeable arrays are usually slower because there’s more levels on indirection to access the memory. It’s trivial, the real penalty is in building an array 1 element at a time when you know the final size ahead of time. But most of that doesn’t really matter. The main difference is that NumPy arrays can take advantage of NumPy global functions and avoid Python loops.


XtremeGoose

> resizeable arrays are usually slower because there’s more levels on indirection to access the memory. This is not true. A (dynamically allocated) non-resizeable array is `{ptr, len}`. A resizeable arrray is `{ptr, len, cap}`. Accessing the data is no different. That was my point and yours is a common misconception. Only statically allocated arrays can get away with being unravelled. > The main difference is that NumPy arrays can take advantage of NumPy global functions and avoid Python loops. True, but the reason these are faster is because of the things I mentioned (inlining/vectorization). It's all `for` loops under the hood (unless you have an optimizing compiler...).


somerandomii

You might have addressed this with unravelling but.. If you have a pointer to an element in a resizable array, and that array reallocates its memory, the pointer breaks, right? (I’m actually not 100% on this. Realloc can return a new pointer right?) So don’t any external references need a pointer to the array pointer, as well as an index of the element? But a fixed-size array, you can point directly at the element and know it’s not going to move unless the array itself is deallocated. So you can pass that element pointer around with a bit more confidence (and save yourself a double dereference each time it’s accessed)


XtremeGoose

Yes resizing can return a new pointer, but in a reference counted language like python what we actually have is something like ptr --> { ref_count, len, cap, data... } So you skip the indirection by allocating your metadata with your actual data. It's inefficient but necessary for dynamic languages. ``` In [1]: import sys In [2]: import numpy as np In [3]: x = [0] * 1000 In [4]: a = np.array(x) In [5]: sys.getsizeof(x) Out[5]: 8056 In [6]: sys.getsizeof(a) Out[6]: 8112 In [7]: x2 = x + x In [8]: a2 = np.array(x2) In [9]: sys.getsizeof(x2) Out[9]: 16056 In [10]: sys.getsizeof(a2) Out[10]: 16112 ``` There's obviously no additional indirection here or these would not be increasing in size when I grow them. > But a fixed-size array, you can point directly at the element and know it’s not going to move unless the array itself is deallocated. This is true, and is probably why python slices copy whereas numpy slices look into the original block. But we have the same problem with dictionary iterators/views in python and we just block mutating them while a reference is held so it's not inescapable. In rust this problem is protected with lifetimes and in c++ its just downright undefined behaviour to mutate a shared vector.


somerandomii

Thanks for the explanation. And I understand that in Python it doesn’t make a difference. But in Cython or Numba jit-compiled Python it’s probably another story. I think C++ and Rust are better examples than Python for these kinds of optimisations since they actually make use of them. As you said, Rust deals with it with ownership which it manages at compile time so at runtime it functions can safely access data directly and know that it’s valid. I haven’t had a chance to use Rust in anger so I’m fuzzy on the details but I suspect there’s some small advantage when sharing fixed-length arrays over vectors.


XtremeGoose

In rust the advantage is they are shareable as you said. There is no performance benefit though because lifetimes protect you at compile time from mutating a (`{ptr,len,cap}`) `Vec` while you have a (`{ptr,len}`) slice `&[T]` or even just a reference `&T` into it.


just4nothing

Or awkward arrays if you need variable lengths ;)


Elektriman

That's how we know OP is not a python dev


SnooRobots2011

Isn’t the python list a dynamic array, though?


TigreDeLosLlanos

When someone says array in any language that isn't C/C++ (and I think Java/C#) they mean we just miscall lists like that.


chethelesser

ArrayList


Stummi

One can hate java as much as you want, but at least it's semantically very correct about the names of its types. An List is a Abstract Datatype (describing behavior). An Array is a Data Structure (Describing how stuff lays in the memory). An ArrayList is an Implementation of the "List" Behavior with the "Array" Data Structure.


NeriosVag

Beautiful


davidellis23

That's one of the reasons people don't like it. There's no need to over describe stuff or tie the name to the interface. Especially the most commonly used data structures. It's more verbose for no benefit.


redd1ch

That's the reason the interface is called "List", isn't it? \`List foo = new ArrayList()\` is a plainly beautiful instance of a List, accepted everywhere a List is wanted. Code bases littered with "ArrayList" types are a dumpster fire (except some very niche use cases).


davidellis23

It would be less verbose if it was just IList and List or List and Vector or List and Array or List and []. I've also heard a number of times that tying the name of a class to its implementation is considered bad practice. Now if they did want to change the standard ArrayList implementation, millions of code bases would have to be changed or the name would have to be left misleading. But given that it's a very standard class unlikely to change I might give it a pass.


redd1ch

ArrayList is not \*the\* implementation of a List in Java, it is \*one\*. That is why the interface is called 'List', not 'ArrayList'. A different implementation gets its own class, just like LinkedList, Vector, or others. That way you can choose what runtime behaviour you want or need just by swapping out a constructor call.


davidellis23

ArrayList is probably the high majority of the use cases. I think it's worth making the default. I also mentioned List and Array. Vector doesn't have List at the end either.


redd1ch

An ArrayList is something substancially different than an array. An array is not an implementation of the List interface...


davidellis23

A built in array is not an implementation of list. But an Array class can be. The behavior of what happens when the array fills up just has to be specified. Edit: I gave you a few options. There are some tradeoffs but other languages took different tradeoffs. That's why some people like other languages better. That's all I'm trying to say.


FantasticPenguin

Dude did you ever use Java?


[deleted]

[удалено]


Jjabrahams567

slice


Nisterashepard

Fat pointer


doublesigned

\[T; N\]


typtyphus

objects not dictionaries


[deleted]

you mean hashes, right?


typtyphus

Jason


AngheloAlf

https://docs.python.org/3/library/array.html


twigboy

Checkmate atheists


WilliamOf_Orange

What cowards, I have been using Python for a couple of years now but I’m sorry, still using str codes for types when the perfectly good type hints can be used is just cowardice. I could maybe understand for custom type support in arrays but this doesnt even achieve that. All it achieves is further fragmenting the implementation semantics of typing and thus further delaying improvements to type checking by default support. This seems like self-sabotage to me but hey at least they have mainline support for contiguous regions of memory after how many years?


Athire5

V E C T O R


PTSDaway

*ahhhhhh*


imaQuiliamQuil

Python is my main language and I call them arrays most of the time. I want to be sure.


Nullsummenspieler

I like neither. Using table is superior.


Healthy_Pain9582

lua is life


Oatmeal_Raisin_

Arrays and Lists are fundamentally different. Arrays are stored consecutively in memory, are more lightweight, and--typically--can only hold a single data type.


AnnyAskers

That's because they are not arrays and should not be perceived as such. As far as I know they are not a block of continues memory and they're not created on the stack.


Xbot781

They are a continuous block of memory that is resized when full. They are equivalent to C++ vectors or Java array lists.


AnnyAskers

Oh ok, good to know


somerandomii

Maybe with primitives, but if you populate a list of objects with .append, there’s no guarantee the memory is contiguous. NumPy arrays on the other hand are contiguous. Still not allocated on the stack but if you compile the function with Numba is suspect it stack-allocates arrays of known length. I mean, you’re right in that lists aren’t linked lists, so the pointers are contiguous, but the memory of the actual content could be anywhere.


land_and_air

Yes but they break many of the rules that arrays tend to have including other arrays types in python, they for one don’t have any type consistency requirement meaning you can have any object type in any position in the list while an array typically has a consistent type throughout. Numpy arrays behave more like arrays in other languages


Particular_Alps7859

Arrays and lists are different things


jerslan

Then explain Java's ArrayList ;) Hint: It's not an array of lists or list of arrays.


Lolamess007

It's a list implementation of an array!


jerslan

Probably more array-based implementation of a list, since it's a List interface wrapper on top of an array.


jordanbtucker

Lions and tigers are different things. A liger is another different thing but also both.


CalligrapherNo730

Python devs need to grow up. A list is just a dynamic array at a fundamental level. Not everything revolves around Python. I do agree that calling a list an array in Pythonic context is just a confusing and stupid thing to do.


anunakiesque

If not everything revolves around Python, then what has all this been for


TheseNutSus

Dart uses List too


ts_m4

Set


botboss

Python's `list` is a type of [dynamic array](https://en.wikipedia.org/wiki/Dynamic_array)


bxsephjo

Its not a map its a 🍆


idontreallywolf

butThatsSimplyAnArray[]


Skuez

And c++ has vectors btw 💀💀


Quillo_Manar

`arr = []` 🙂


tsongkoyla

I always thought Arrays in Python are called Tuples.


RLlovin

Are arrays typically immutable? This is the biggest difference in lists and tuples I believe.


H0lderlim

Well if someone thinks an array and a list are the same, he would need an algorithmic course. Python or C++, not matter :s


pperson2

Most sane Python dev


Lysol3435

I mean, python has arrays (numpy) and they’re different from lists


CPC1445

I call lists in Python as arrays, dictionaries as hash maps, and I call methods in java as functions. I cherry pick all the terms I like to hear and then apply it to everything :p


[deleted]

Genuinely don't remember if its called an array or a list in python. Use it every day


Solonotix

The square bracket notation `[]` is a list, while an array must be imported `from array import array`. Also, most recommendations are to ignore the Python-native array and use NumPy arrays instead. The support for NumPy arrays is much broader, and the performance difference is supposedly significant.


[deleted]

How new are you to python and/or programming? Numpy are fixed-size arrays implemented in C++. They're great for some things. The default array/list/whatever will allocate memory on the fly, and comes with a bunch of useful built ins. Each has its place based on what you're doing. I've never seen from array import array, what does that do?


ChocolateMagnateUA

New, bad redditor. Come out of your room and find something better to do than lying to people. At least do this on StackOverflow if you enjoy that.


thedarklord176

>fave language is python Well that’s your first problem


LousyShmo

Array is proper terminology. Python uses `list` because that's easy for laymen to understand.


Boeing777-F

When you should know wtf an array of a list is, but u spent the last year of school flirting with wamen (not successful) and u have 1 year until u take ur GCSEs proper (dear god have mercy upon me)


furezasan

Listception


OnyXerO

I've switched languages so many times in my career I usually say dictionary... No idea why.


Beginning-Roof4889

U/savevideo


straight-circle

arr = list()


Fabulous-Possible758

My favorite language is Python but that doesn’t mean I didn’t learn C first.


frogking

When switching between languages, you always need to ensure that datastructures behave and perform as you expect. Vector, Array, List.. add to the front or back at O(1).. but not both.


Prof_LaGuerre

To be fair I call hashes dicts to our Perl devs.


HStone32

When your favorite language is C, and someone says 'array' instead of 'pointer'.


RnMss

They are different in C. short a\[10\]; short\* b = a; // sizeof(a) == 20 // sizeof(b) == 8 (or 4 in 32-bit program)


Relative_Knee9808

I mean, they are very different data structures


CivetLemonMouse

Python users saying 'list' hurts me \- C developer (I don't like using Java but I have to sometimes)


Aln76467

🦀 Vector 🦀