Basic Array and Pointers in C
Review of array and pointers.
Shortcut for Quick Review:
Update (2020.12.05): Fix typo in description, (&a
and &a[0]
have different types, so &a
is not an alias of &a[0]
) (More on Pointers and Array)
Structure
We’ll start by reviewing some basics about structures.
If we want to make a 2D game with different objects, we may need to have its position coordinates and size stored in arrays. Something like:
float enemy_x[MAX];
float enemy_y[MAX];
float enemy_w[MAX];
float enemy_h[MAX];
float player_x;
float player_y;
float player_w;
float player_h;
However, it gets confusing once the variable count increases. So Structures come into rescue:
struct object {
float x, y, w, h;
};
struct object enemy[MAX];
struct object player;
and then, we can access the coordinates easily like:
player.x = 0;
The struct object
declaration is quite long and kind of redundant, so we can use typedef
:
struct object {
float x, y, w, h;
};
typedef struct object Object;
Object enemy[MAX];
Object player;
We can even combine the declaration of struct
with typedef
.
typedef struct object {
float x, y, w, h;
} Object;
Object enemy[MAX];
Object player;
and even omit the object
type:
typedef struct {
float x, y, w, h;
} Object;
Object enemy[MAX];
Object player;
By using Structures, we can define a special type for variable declaration afterward. More advanced usages such as bit-fields, union, and enumeration can provide more flexible controls on Structures.
Arrays
For using static arrays, we can only declare it with constant length.
-
Static array initialization
What is the result of:
#include <stdio.h> #define MAX 3 int a[MAX]; int b[MAX] = {}; int c[MAX] = {0}; int d[MAX] = {1}; int e[MAX] = {1, 1, 1}; int main(void) { for (int i = 0; i < MAX; i++) printf("%d ", a[i]); for (int i = 0; i < MAX; i++) printf("%d ", b[i]); for (int i = 0; i < MAX; i++) printf("%d ", c[i]); for (int i = 0; i < MAX; i++) printf("%d ", d[i]); for (int i = 0; i < MAX; i++) printf("%d ", e[i]); return 0; }
Click to expand Answer
✔️ Answer
0 0 0 0 0 0 0 0 0 1 0 0 1 1 1
All entries are initialized into zero when the bracket
= {}
initialization is used. Using= {1}
is kind of like initialize all entries into zero but the first entry should be 1. So something like= {0}
is actually a redundant version of= {}
. -
Multi-dimensional Static Array
What is the result of:
#include <stdio.h> #define N 3 #define M 2 int a[N][M] = {{1}, {2, 3}}; int main(void) { for (int i = 0; i < N; i++) { for (int j = 0; j < M; j++) { printf("a[%d][%d] is %d\n", i, j, a[i][j]); } } return 0; }
Click to expand Answer
✔️ Answer
a[0][0] is 1 a[0][1] is 0 a[1][0] is 2 a[1][1] is 3 a[2][0] is 0 a[2][1] is 0
This declaration can be seen as declaring a size-3 array, and each array cell is a size-2 array.
int (a[3])[2];
-
Variable-Length Arrays (VLA)
#include <stdio.h> int main(void) { int N; scanf("%d", &N); int a[N]; return 0; }
You may see some code that declares an array as above, where
N
is not a constant value. They are called Variable-Length Arrays and are only supported after C99. There are some drawbacks to using VLAs, to avoid confusion, you can use the dynamic array as shown in the next section.
Pointers
Misusing pointers can pose serious security threats to programs. You’ll often see Buffer Overflow security fixes in Google Chrome, Mozilla Firefox, or any other applications. These security issues are mostly caused by pointers. Thus, in many high-level languages, the direct manipulation of pointers is forbidden.
If pointers are so dangerous, then why use pointers in the first place? This is because some cases require using pointers:
- Function returning more than one value
- Send arguments more efficiently
- Access dynamic allocated memory
- Implement data structures like linked lists, trees, etc.
- Call functions through a variable
- etc.
-
Function returning more than one value
Before starting let’s take a look at the snippet below:
int a, b; a = 1; b = a;
For the second line,
a
is on the left-hand side of the assignment operator. The value1
is written toa
’s memory address.For the third line,
a
is on the right-hand side of the assignment operator. The value ofa
is read from its memory address and written tob
’s memory address (I’ll ignore registers and caches here for the sake of simplicity).If we want to change
a
’s value through a function call, we can only assign it by follows:int a; a = func_a();
If we want to change multiple variables’ value, we may want to type the following:
a, b = func()
which works in Python, but not in C. So, we need to find a way to send the variables’ address into the function, and let the code inside the function change its value. However, something like:
int a, b; func(a, b);
doesn’t work since the value is read from the memory addresses before sending into the function, if we change the variables inside the function, it only changes the value of the local variables inside that function. So, we send their memory addresses instead:
int a, b; func(&a, &b);
which is how
scanf
works, and the return value can now be used to indicate whether the function runs successfully.In the case above, we wanted to change multiple variables’ value. What if the value we want to change is a memory address?
int* a; a = malloc(sizeof(int)); free(a);
If we don’t want to use the return value of the function, then it’ll become something like:
#include <stdio.h> #include <stdlib.h> void my_malloc(int** a, size_t size) { *a = malloc(size); } int main(void) { int* a; my_malloc(&a, sizeof(int)); free(a); return 0; }
Sending the pointer
a
doesn’t work here since we can only change the value stored in the memory address stored ina
(i.e.,*a
). If we want to change the memory address stored ina
(a
), we need to senda
’s memory address into the function.This double-pointer allocation is commonly used in CUDA programming and also used a lot when interacting with Windows native APIs.
-
Send arguments more efficiently
What’s the output of:
#include <stdio.h> #define MAX 100 typedef struct { int content[MAX]; } LargeArray; void func(LargeArray la) { printf("sizeof la is %zu\n", sizeof la); } void funcp(LargeArray* pla) { printf("sizeof pla is %zu\n", sizeof pla); printf("sizeof la is %zu\n", sizeof *pla); } int main(void) { LargeArray la; func(la); funcp(&la); return 0; }
Click to expand Answer
✔️ Answer
sizeof la is 400 sizeof pla is 8 sizeof la is 400
You can see that the variable
la
is quite large and requires an additional copy step to send it intofunc
. But we can send the structure’s pointer instead, so we only need to copyla
’s address, which is much smaller. -
Access dynamic allocated memory
We can allocate dynamic resources for later usages. For example a 1d dynamic array:
#include <stdio.h> #include <stdlib.h> #define MAX 100 int main(void) { int* a; a = malloc(sizeof(int) * MAX); free(a); return 0; }
Freeing (
free
) any allocated (malloc
) resources is a good coding practice. Forgetting to free allocated resources results in memory leaks, which is a major issue for products and applications.When developing applications/games, we may want to allocate some resources when doing a certain operation, where this operation is repetitively called. If we forget to free these resources, the program will keep asking for more memory spaces from the OS, and eventually use up all memory and crashes. (Sometimes the OS kills the program, but sometimes the OS crashes due to Out Of Memory issue (OOM)) A dangerous code is shown below, which may crash your OS:
#include <stdio.h> #include <stdlib.h> #define MAX 1000000 void game_update() { int* a; a = malloc(sizeof(int) * MAX); // Do something... // and forget to free this resource. // free(a); } int main(void) { while (1) { game_update(); } return 0; }
-
Implement data structures like linked lists, trees, etc.
Linked List:
typedef struct _node { int value; struct _node* next; } Node;
Binary Tree:
typedef struct _node { int value; struct _node *left, *right; } Node;
-
Call functions through a variable
What is the result of:
#include <stdio.h> #include <stdlib.h> #define MAX 1000000 void add(int a, int b) { printf("add : %d\n", a+b); } void mult(int a, int b) { printf("mult: %d\n", a*b); } int main(void) { void (*fptr) (int, int); fptr = &add; fptr(2, 3); fptr = &mult; fptr(2, 3); return 0; }
Click to expand Answer
✔️ Answer
add : 5 mult: 6
This increases a lot of flexibility! The native function
qsort
instdlib.h
uses a function pointer to let the programmer define its own comparison function for sorting arrays of different types/structures.
More on Pointers and Array
Arrays are not equal to Pointers, they are similar but not the same. In most of the time, they can be used in the same way.
This part is the most interesting, we’ll start by reviewing basic operators.
Operator review:
&var
returnsvar
’s memory address.*ptr = c
assignc
to the address whereptr
is pointing.r = *ptr
takes out the value whereptr
is pointing and store it inr
.-
ptr++
Move the pointer forward forsizeof(*ptr)
-byte, that is, to move n-bytes, where n is the size of the type thatptr
is pointing to.e.g.
int* ptr
, ifptr == 0x00
,ptr+1 == 0x04
, assumingsizeof(int)
is 4. -
The arithmetic of
a
is the same with&a[0]
. (However, they have different types, the former is an array, the latter is a pointer)e.g.
int a[100]
, ifa == 0x00
,&a[0] == 0x00
,a+1 == 0x04
,&a[0]+1 == 0x04
, assumingsizeof(int)
is 4. -
a[i]
is an alias of*(a+i)
.So,
i[a]
is valid! (which is equal toa[i]
) a[i][j]
is an alias of*(*(a+i)+j)
.struct_ptr->var
is an alias of(*struct_ptr).var
.
-
Size of Pointers and Arrays.
What is the output of:
#include <stdio.h> #include <stdlib.h> #define N 2 #define M 3 int main(void) { // Declaration int i, j, *tmp; int a[N*M]; int b[N][M]; int c[M][N]; int *pa = malloc(sizeof(int) * M * N); int **pb = malloc(sizeof(int*) * N); int **pc = malloc(sizeof(int*) * M); // Allocate second level resources all at once to reduce allocation overhead. tmp = malloc(sizeof(int) * M * N); for (i = 0; i < N; i++) pb[i] = tmp + i * M; tmp = malloc(sizeof(int) * M * N); for (i = 0; i < M; i++) pc[i] = tmp + i * N; // Print printf("sizeof(int) : %2zu\n", sizeof(int)); printf("sizeof(int[M]) : %2zu\n", sizeof(int[M])); printf("sizeof(int[N]) : %2zu\n", sizeof(int[N])); printf("sizeof(int[N][M]) : %2zu\n", sizeof(int[N][M])); printf("sizeof(int[M][N]) : %2zu\n", sizeof(int[M][N])); printf("sizeof(void*) : %2zu\n", sizeof(void*)); printf("sizeof(int*) : %2zu\n", sizeof(int*)); printf("sizeof(int**) : %2zu\n", sizeof(int**)); printf("sizeof a : %2zu\n", sizeof a); printf("sizeof *a : %2zu\n", sizeof *a); printf("sizeof b : %2zu\n", sizeof b); printf("sizeof *b : %2zu\n", sizeof *b); printf("sizeof **b : %2zu\n", sizeof **b); printf("sizeof c : %2zu\n", sizeof c); printf("sizeof *c : %2zu\n", sizeof *c); printf("sizeof **c : %2zu\n", sizeof **c); printf("sizeof pa : %2zu\n", sizeof pa); printf("sizeof *pa : %2zu\n", sizeof *pa); printf("sizeof pb : %2zu\n", sizeof pb); printf("sizeof *pb : %2zu\n", sizeof *pb); printf("sizeof **pb : %2zu\n", sizeof **pb); printf("sizeof pc : %2zu\n", sizeof pc); printf("sizeof *pc : %2zu\n", sizeof *pc); printf("sizeof **pc : %2zu\n", sizeof **pc); // Free free(pa); free(pb[0]); free(pb); free(pc[0]); free(pc); return 0; }
Click to expand Answer
✔️ Answer
sizeof(int) : 4 sizeof(int[M]) : 12 sizeof(int[N]) : 8 sizeof(int[N][M]) : 24 sizeof(int[M][N]) : 24 sizeof(void*) : 8 sizeof(int*) : 8 sizeof(int**) : 8 sizeof a : 24 sizeof *a : 4 sizeof b : 24 sizeof *b : 12 sizeof **b : 4 sizeof c : 24 sizeof *c : 8 sizeof **c : 4 sizeof pa : 8 sizeof *pa : 4 sizeof pb : 8 sizeof *pb : 8 sizeof **pb : 4 sizeof pc : 8 sizeof *pc : 8 sizeof **pc : 4
First, remember, pointers are just variables that store memory addresses, so
void*
,int*
,int**
, or evenint*****
should be 8-bytes, assuming using a 64-bit computer. Let’s explain each results:Target Type Size sizeof a : 24
int[6]
sizeof(int[6])
sizeof *a : 4
int
sizeof(int)
sizeof b : 24
int[2][3]
sizeof(int[2][3])
sizeof *b : 12
int[3]
sizeof(int[3])
sizeof **b : 4
int
sizeof(int)
sizeof c : 24
int[3][2]
sizeof(int[3][2])
sizeof *c : 8
int[2]
sizeof(int[2])
sizeof **c : 4
int
sizeof(int)
sizeof pa : 8
int*
sizeof(int*)
sizeof *pa : 4
int
sizeof(int)
sizeof pb : 8
int**
sizeof(int**)
sizeof *pb : 8
int*
sizeof(int*)
sizeof **pb : 4
int
sizeof(int)
sizeof pc : 8
int**
sizeof(int**)
sizeof *pc : 8
int*
sizeof(int*)
sizeof **pc : 4
int
sizeof(int)
-
Array as Arguments
What is the output of:
#include <stdio.h> #include <stdlib.h> void func1(int a[]) { printf("%zu\n", sizeof a); } void func2(int *a) { printf("%zu\n", sizeof a); } int main(void) { int a[100]; printf("%zu\n", sizeof a); func1(a); func2(a); return 0; }
Click to expand Answer
✔️ Answer
400 8 8
Note that
sizeof a
is determined by the compiler when compiling to assembly.The
int a[]
is just syntactic sugar forint* a
, they are both pointers (array decayed into pointer). -
Global array vs. Local array
What is the difference between
#include <stdio.h> #define MAX 100000000 int main(void) { int a[MAX]; a[MAX-1] = 10; printf("%d\n", a[MAX-1]); return 0; }
and
#include <stdio.h> #define MAX 100000000 int a[MAX]; int main(void) { a[MAX-1] = 10; printf("%d\n", a[MAX-1]); return 0; }
Click to expand Answer
✔️ Answer The first code segment (
a
defined inmain
) may result in Stack Overflow (which shows Segmentation Fault), while the second one (a
in global) will not cause Stack Overflow.More details on this can be learned in the Computer Architecture course.
-
Pointer Compatibility
int * pt; int (*pa)[3]; int ar1[2][3]; int ar2[3][2]; int **p2; int *arp[3];
Which of the operations below are valid?
pt = &ar1[0][0]; pt = ar1[0]; pt = ar1; pa = ar1; pa = ar2; p2 = &pt; *p2 = ar2[0]; p2 = ar2; pa = arp;
Click to expand Answer
✔️ Answer
Target Valid? L-Type R-Type pt = &ar1[0][0];
$\color{green} ✔$ Valid pointer-to-int pointer-to-int pt = ar1[0];
$\color{green} ✔$ Valid pointer-to-int pointer-to-int pt = ar1;
$\color{red} ✘$ Invalid pointer-to-int pointer-to-int[3] pa = ar1;
$\color{green} ✔$ Valid pointer-to-int[3] pointer-to-int[3] pa = ar2;
$\color{red} ✘$ Invalid pointer-to-int[3] pointer-to-int[2] p2 = &pt;
$\color{green} ✔$ Valid pointer-to-int* pointer-to-int* *p2 = ar2[0];
$\color{green} ✔$ Valid pointer-to-int pointer-to-int p2 = ar2;
$\color{red} ✘$ Invalid pointer-to-int* pointer-to-int[2] pa = arp;
$\color{red} ✘$ Invalid pointer-to-int[3] pointer-to-int* The pointer can only be assigned if the type matches. Something like
ar1[2][3]
cannot be seen asint**
, but should be seen as pointer-to-int[3].For the difference of
pa
andarp
here is thatpa
is a pointer-to-int[3], whilearp
is an array of pointers that can be seen as pointer-to-int*. -
Pointer Arithmetic
What is the output of:
#include <stdio.h> int main(void) { int a[3] = {0, 1, 2}; int *p; for (int i = 0; i < 3; i++) { p = a + i; printf("%td\n", (char*)p-(char*)a); printf("%td\n", (p-a)); } return 0; }
Click to expand Answer
✔️ Answer
0 0 4 1 8 2
Note the difference between types.
-
Pointing one past the last element of the array
What is the output of:
#include <stdio.h> int main(void) { int a[3] = {0, 1, 2}; int *p; for (int i = 0; i <= 3; i++) { p = a + i; printf("%td\n", (char*)(p)-(char*)a); printf("%td\n", (p-a)); } return 0; }
Click to expand Answer
✔️ Answer
0 0 4 1 8 2 12 3
Pointing to one past the last element of the array is valid, as long as you do not dereference the pointer.
-
Pointer Out-of-bounds
What is the output of:
#include <stdio.h> int main(void) { int a[3] = {0, 1, 2}; int *p; for (int i = 0; i <= 4; i++) { p = a + i; printf("%td\n", (char*)(p)-(char*)a); printf("%td\n", (p-a)); } return 0; }
Click to expand Answer
✔️ Answer
Undefined Behavior.
Pointing out-of-bounds is undefined behavior, even without dereferencing the pointer, according to C’s standard.
Constants
const int* ptr
is the same asint const* ptr
, which protects the value pointed.*ptr = 0; // Not allowed ptr = &a; // Allowed
int * const ptr
, which protects the pointer variable itself.*ptr = 0; // Allowed ptr = &a; // Not allowed
const int * const ptr
is the same asint const * const ptr
.*ptr = 0; // Not allowed ptr = &a; // Not allowed
Interpretation on Syntaxes
-
Arrays
For
int a[2][3]
, it can be interpreted from the inside to the outside. Soa
is an array with 2 cells. Each of the cells contains anint[3]
element, that is, an array with 3 cells storing int. -
Pointers
You may interpret it in 2 ways:
int* ptr
,ptr
is a variable, and its type isint*
, that is, a variable storing memory address.int (*ptr)
,ptr
is a pointer, and the pointed memory address represents anint
.
For
int (*pa)[3]
, it can also be interpreted from the inside to the outside. Sopa
is a pointer, pointing to anint[3]
element.For
int *ap[3]
, it is an array of 3 cells. Each cell stores anint*
, that is, a pointer-to-int. (Left-Right Rule)
There are much more tricky problems on pointers and arrays, but by thinking hard on the questions above and understanding the underlying reason for the results should be quite enough.
If you have trouble understanding a C pointer statement, try out this website:
Note that the result of sizeof
operator in the questions above may have different outputs on different computers.
This should be the last assignment for this semester, I hope you have better understandings of the C language through finishing these six assignments.
Epilogue
If you feel frustrated when trying to understand these concepts, it should be a good sign. Since it means that you reinforced these important concepts that cannot be learned just by writing codes for Online Judges.
Comments