Annex Metadata Filtered Views

Git annex goes mindbogglingly deep.

I use git annex to manage my photos -- I have tons of RAW photos in
`~/Pictures` that are backed up to a few git annex special remotes.

Whenever I'm done editing a set of raw files I go through this song and
dance:

``` 
cd raw
git annex add
git annex copy --to=s3
git annex copy --to=nas
git commit -m 'Added a bunch of files to s3'
git push origin master
git push origin git-annex
git annex drop
```
And magically my raw files are available on s3 and my home NAS.
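
The song and dance above could be wrapped in a small shell function. This is only a sketch: `archive_raw`, `run`, and the `DRY_RUN` switch are hypothetical names I made up for illustration, and the remotes (`s3`, `nas`, `origin`) are the ones from the steps above.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of git annex.
# With DRY_RUN=1 each command is printed instead of executed.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Add a directory to the annex, copy it to both remotes, push, then drop
# the local copies. Stops at the first step that fails, so nothing is
# dropped unless the copies and pushes succeeded.
archive_raw() {
  local dir=${1:-raw} remote
  run git annex add "$dir" || return
  for remote in s3 nas; do
    run git annex copy --to="$remote" "$dir" || return
  done
  run git commit -m "Added files from $dir to s3 and nas" || return
  run git push origin master git-annex || return
  run git annex drop "$dir"
}
```

Setting `DRY_RUN=1` and calling `archive_raw raw` prints the six commands without touching the repo, which is a cheap way to check the order before trusting it with real files.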

If I jump over to a different machine I can run:

``` 
cd ~/Pictures
git fetch
git rebase
git annex get [new-raw-file]
```
Now that raw file is available on my new machine. I can open it in
Darktable or do whatever I want to it: it's just a file.

This is a pretty powerful extension of git.

While I was reading the [git annex
internals](https://git-annex.branchable.com/internals/) page today I
stumbled across an even more powerful feature: metadata. You can store
and retrieve arbitrary metadata about any git annex file. For instance,
if I wanted to store EXIF info for a particular file, I could do:

``` 
git annex metadata 20140118-BlazeyAndTyler.jpg --set exif="$(exiftool -S \
  -filename \
  -filetypeextensions \
  -make \
  -model \
  -lensid \
  -focallength \
  -fnumber \
  -iso 20140118-BlazeyAndTyler.jpg)"
```

And I can drop that file and still retrieve the EXIF data:

``` 
$ git annex drop 20140118-BlazeyAndTyler.jpg
drop 20140118-BlazeyAndTyler.jpg (checking tylercipriani-raw...) (checking tylercipriani-raw...) (checking tylercipriani-raw...) ok
(recording state in git...)
$ git annex metadata --get exif !$
git annex metadata --get exif 20140118-BlazeyAndTyler.jpg
FileName: 20140118-BlazeyAndTyler.jpg
Make: SAMSUNG
Model: SPH-L720
FocalLength: 4.2 mm
FNumber: 2.2
ISO: 125
```

This is pretty neat, but it can also be achieved with [git
notes](https://tylercipriani.com/blog/2016/08/26/abusing-git-notes/) so
it's nothing too spectacular.
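
The `exif` field above holds the whole exiftool dump as one value. A possible variation is to split the `Tag: value` lines into one metadata field per tag. The sketch below does only the text conversion; `exif_to_fields` is a hypothetical name, and it assumes the `exiftool -S` output format shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "Tag: value" lines on stdin (the exiftool -S
# format shown above) and prints "tag=value" lines, one per field.
exif_to_fields() {
  local line tag value
  while IFS= read -r line; do
    case $line in
      *": "*) ;;          # looks like "Tag: value"
      *) continue ;;      # skip anything else
    esac
    tag=${line%%: *}      # text before the first ": "
    value=${line#*: }     # text after the first ": "
    # Lowercased only for consistency in the printed names.
    tag=$(printf '%s' "$tag" | tr '[:upper:]' '[:lower:]')
    printf '%s=%s\n' "$tag" "$value"
  done
}

printf 'Make: SAMSUNG\nModel: SPH-L720\n' | exif_to_fields
# prints:
# make=SAMSUNG
# model=SPH-L720
```

Each printed line could then be handed to `git annex metadata` as its own `--set` argument. I haven't tried that against my photo repo, so treat it as an idea rather than a recipe.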

But git annex metadata doesn't quite stop there.

My picture directory is laid out like this:

```
Pictures/
└── 2015
    └── 2015-08-14_Project-name
        ├── bin
        │   └── convert-and-resize.sh
        ├── edit
        │   ├── 2015-08-14_Project-name_00001.jpg
        │   └── 2015-08-14_Project-name_00002.jpg
        └── raw
            └── 2015-08-14_Project-name_00001.NEF
```

I have a directory for each year. Under those I create project
directories named with the [ISO 8601](https://xkcd.com/1179/) import
date for the photos followed by some memorable project name (like
`mom-birthday` or `rmnp-hike`). Inside each project directory I have 2
directories: `raw` and `edit`. Inside each of those, photos are named
with the ISO 8601 import date, the project name, a 5-digit import
number, and a file extension -- raw files go in `raw` and
edited/finished files go in `edit`.
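
As a sketch, the naming scheme boils down to two `printf` calls. The function names `project_dir` and `photo_name` are hypothetical; only the layout comes from the tree above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers that spell out the naming convention above.

# project_dir DATE PROJECT -> YEAR/DATE_PROJECT
project_dir() {
  # ${1%%-*} keeps everything before the first "-", i.e. the year.
  printf '%s/%s_%s\n' "${1%%-*}" "$1" "$2"
}

# photo_name DATE PROJECT NUMBER EXT -> DATE_PROJECT_NNNNN.EXT
# NUMBER is a plain decimal integer without leading zeros.
photo_name() {
  printf '%s_%s_%05d.%s\n' "$1" "$2" "$3" "$4"
}

project_dir 2015-08-14 Project-name
# prints: 2015/2015-08-14_Project-name
photo_name 2015-08-14 Project-name 1 NEF
# prints: 2015-08-14_Project-name_00001.NEF
```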

I got this system from [Riley
Brandt](http://www.rileybrandt.com/lessons/) (I can't recommend the
Open Source Photography Course enough -- it's amazing!) and it's
served me well. I can find stuff! But git annex really expands the
possibilities of this system.

## Fictional, real-world, totally real actually happening scenario

I go to Rocky Mountain National Park (RMNP) multiple times per year.
I've taken a lot of photos there. If I take a trip there in October I
will generally import those photos in October and create
`2015/2015-10-05_RMNP-hike/{raw,edit}`, and then if I go there again
next March I'd create
`2016/2016-03-21_RMNP-daytrip-with-blazey/{raw,edit}`. So if I want to
preview my RMNP edited photos from October I'd go:

```
cd 2015/2015-10-05_RMNP-hike/edit
git annex get
geeqie .
```

But what happens if I want to see all the photos I've ever taken in
RMNP? I could probably cook up some script to do this. Off the top of my
head I could do something like `find . -iname '*rmnp*' -type l`, but
that would undoubtedly miss some files from a project in RMNP that I
didn't name with the string `rmnp`. Git annex gives me a different
option: metadata tags.

## Metadata Tags

Git annex supports a special type of short metadata -- `--tag`. With
`--tag`, you can tag individual files in your repo.

The WMF reading team offsite in 2016 was partially in RMNP, but I
didn't name any photos `RMNP` because that wasn't the most memorable
bit of information about those photos (`reading-team-offsite` seemed
like a better project name) nor did `RMNP` represent all the photos from
the offsite. I should tag a few of those photos `rmnp` with git annex:

```
$ cd ./2016/2016-05-01_wikimedia-reading-offsite/edit/
$ git annex metadata --tag rmnp Elk.jpg
metadata Elk.jpg 
  lastchanged=2016-09-29@04-14-44
  tag=rmnp
  tag-lastchanged=2016-09-29@04-14-44
ok
(recording state in git...)
$ git annex metadata --tag rmnp Reading\ folks\ bing\ higher\ up\ than\ it\ looks.jpg
metadata Reading folks bing higher up than it looks.jpg 
  lastchanged=2016-09-29@04-14-57
  tag=rmnp
  tag-lastchanged=2016-09-29@04-14-57
ok
(recording state in git...)
```

Also, when my old roommate came to town we went to RMNP, but I named
that project `cody-family-adventuretime`. So let's tag a few of those
photos `rmnp`, too:

```
$ cd 2015/2016-01-25_cody-family-adventuretime/edit
$ git annex metadata --tag rmnp alberta-falls.jpg
metadata alberta-falls.jpg 
  lastchanged=2016-09-29@04-17-48
  tag=rmnp
  tag-lastchanged=2016-09-29@04-17-48
ok
(recording state in git...)
```

## Metadata views

Now the thing that really surprised me: you can filter the whole
pictures directory by a particular tag with git annex using a
[metadata driven
view](https://git-annex.branchable.com/tips/metadata_driven_views/).

```
$ tree -d -L 1
.
├── 2011
├── 2012
├── 2013
├── 2014
├── 2015
├── 2016
├── instagram
├── lib
├── lossy
├── nasa
└── Webcam

$ git annex view tag=rmnp
view  (searching...) 
Switched to branch 'views/(tag=rmnp)'
ok
$ ls
alberta-falls_%2015%2016-01-25_cody-family-adventuretime%edit%.jpg
Elk_%2016%2016-05-01_wikimedia-reading-offsite%edit%.jpg
Reading folks bing higher up than it looks_%2016%2016-05-01_wikimedia-reading-offsite%edit%.jpg
```
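
The file names in the view appear to encode each photo's original location as `name_%dir%dir%.ext`. Going only by the listing above (I haven't checked how git annex handles files at the top level or names that already contain `%`), a sketch that maps a view name back to its original path could look like this. `decode_view_name` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turns "name_%dir1%dir2%.ext" back into
# "dir1/dir2/name.ext". Assumes the format seen in the listing above.
decode_view_name() {
  local name=$1 ext rest base dirs
  case $name in
    *_%*%*) ;;                          # looks like a view name
    *) printf '%s\n' "$name"; return ;; # leave anything else alone
  esac
  ext=${name##*%}       # text after the last "%", e.g. ".jpg"
  rest=${name%\%*}      # everything before the last "%"
  base=${rest%%_\%*}    # text before the first "_%"
  dirs=${rest#*_\%}     # "%"-separated directories after the first "_%"
  dirs=$(printf '%s' "$dirs" | tr '%' '/')
  printf '%s/%s%s\n' "$dirs" "$base" "$ext"
}

decode_view_name 'Elk_%2016%2016-05-01_wikimedia-reading-offsite%edit%.jpg'
# prints: 2016/2016-05-01_wikimedia-reading-offsite/edit/Elk.jpg
```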

I can even filter this view using other tags with
`git annex vfilter tag=whatever`. And I can continue to edit, refine,
and work with the photo files from there.

This feature absolutely blew my mind -- I dropped what I was doing to
write this. Now I'm trying to think of a good way to work it into my
photo workflow.