python - Batch download google images with tags -


I'm trying to find an efficient, replicable way to batch download full-size image files from a Google Image search. Other people have asked similar things, but I haven't found anything that's quite what I'm looking for, or that I understand.

Most answers refer to the deprecated Google Image Search API, to the Google Custom Search API (which doesn't seem to work on the whole web), or to downloading images from a single URL.

I imagine this is a two-step process: first, pull the image URLs from the search, and then batch download those?

I should add that I'm a beginner (which is probably obvious; sorry). If anyone could explain or point me in the right direction, it would be greatly appreciated.

I've also looked at freeware options, but those seem spotty as well. Unless someone knows of a reliable one.

Download images from Google Image search (Python)

In Python, is there a way I can download all/some of the image files (e.g. JPG/PNG) from a **Google Images** search result?

And does anyone know about the labels, and whether they exist somewhere/are associated with the images? https://en.wikipedia.org/wiki/google_image_labeler

import json
import os
import time
from io import BytesIO

import requests
from PIL import Image
from requests.exceptions import ConnectionError


def go(query, path):
    """Download full size images from Google image search.

    Don't print or republish images without permission.
    I used this to train a learning algorithm.
    """
    base_url = ('https://ajax.googleapis.com/ajax/services/search/images?'
                'v=1.0&q=' + query + '&start=%d')

    base_path = os.path.join(path, query)
    if not os.path.exists(base_path):
        os.makedirs(base_path)

    start = 0  # Google's start query string parameter for pagination.
    while start < 60:  # Google will only return a max of 56 results.
        r = requests.get(base_url % start)
        for image_info in json.loads(r.text)['responseData']['results']:
            url = image_info['unescapedUrl']
            try:
                image_r = requests.get(url)
            except ConnectionError:
                print('could not download %s' % url)
                continue

            # Remove file-system path characters from the name.
            title = image_info['titleNoFormatting'].replace('/', '').replace('\\', '')

            try:
                # Open in binary mode; the JPEG data is written as raw bytes.
                with open(os.path.join(base_path, '%s.jpg' % title), 'wb') as file:
                    Image.open(BytesIO(image_r.content)).save(file, 'JPEG')
            except IOError:
                # Throw away some gifs...blegh.
                print('could not save %s' % url)
                continue

        print(start)
        start += 4  # 4 images per page.

        # Be nice to Google and they'll be nice back :)
        time.sleep(1.5)


# Example use
go('landscape', 'myDirectory')
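The function above does both steps from the question in one loop. As a rough sketch of the second step on its own (batch downloading a list of URLs you already have), something like the following might work. The names `safe_filename` and `download_all` are made up for illustration, and `fetch` is a hypothetical callable you pass in that returns the raw bytes for a URL; none of this comes from the original answer.

```python
import os
import re
from urllib.parse import urlparse


def safe_filename(url, index):
    """Build a file name from a URL, prefixed with its position in the list."""
    name = os.path.basename(urlparse(url).path) or 'image'
    # Replace anything outside a conservative set of characters.
    name = re.sub(r'[^A-Za-z0-9._-]', '_', name)
    return '%03d_%s' % (index, name)


def download_all(urls, directory, fetch):
    """Save every URL into directory; fetch(url) must return the raw bytes.

    Returns a list of (url, path) pairs for the downloads that succeeded.
    """
    os.makedirs(directory, exist_ok=True)
    saved = []
    for index, url in enumerate(urls):
        try:
            data = fetch(url)
        except Exception as error:  # assumed policy: report, skip, keep going
            print('could not download %s: %s' % (url, error))
            continue
        path = os.path.join(directory, safe_filename(url, index))
        with open(path, 'wb') as handle:
            handle.write(data)
        saved.append((url, path))
    return saved
```

With requests installed, `fetch` could be as simple as `lambda url: requests.get(url, timeout=10).content`. Keeping the fetcher separate also makes it easy to try the function with a fake one before hitting real servers.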

Update

I was able to create a custom search using the full web as specified here, and execute it to get the image links, but as mentioned in the previous post, they don't align exactly with the normal Google Image results.
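For anyone trying the same route, here is a minimal sketch of how the request for image links might be put together. It assumes the Custom Search JSON API endpoint and its `key`, `cx`, `q`, `searchType`, `start` and `num` parameters, and assumes each result in the response carries the image address under `link`. The function names, and the `api_key` and `engine_id` arguments, are placeholders of mine, not something from the post.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Custom Search JSON API.
API_ENDPOINT = 'https://www.googleapis.com/customsearch/v1'


def build_image_search_url(query, api_key, engine_id, start=1, num=10):
    """Return the request URL for one page of image results.

    api_key and engine_id stand in for your own credentials.
    """
    params = {
        'key': api_key,
        'cx': engine_id,
        'q': query,
        'searchType': 'image',  # restrict results to images
        'start': start,         # 1-based index of the first result
        'num': num,             # assumed limit: at most 10 per request
    }
    return API_ENDPOINT + '?' + urlencode(params)


def extract_image_links(response):
    """Pull the image URLs out of a decoded JSON response (a dict)."""
    return [item['link'] for item in response.get('items', [])]
```

Paging would then mean calling `build_image_search_url` with `start=1`, `start=11`, and so on, decoding each response with `json.loads`, and collecting the links.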

Try using the ImageSoup module. To install it, simply run:

pip install imagesoup 

Sample code:

>>> from imagesoup import ImageSoup
>>>
>>> soup = ImageSoup()
>>> images_wanted = 50
>>> query = 'landscape'
>>> images = soup.search(query, n_images=images_wanted)

Now we have a list with 50 landscape images from Google Images. Let's play with the first one:

>>> im = images[0]
>>> im.URL
https://static.pexels.com/photos/279315/pexels-photo-279315.jpeg
>>> im.size
(2600, 1300)
>>> im.mode
RGB
>>> im.dpi
(300, 300)
>>> im.color_count
493230
>>> # Let's check the main 4 colors in the image. We use
>>> # reduce_size=True to speed up the process.
>>> im.main_color(reduce_size=True, n=4)
[('black', 0.2244), ('darkslategrey', 0.1057), ('darkolivegreen', 0.0761), ('dodgerblue', 0.0531)]
>>> # Let's take a look at our image
>>> im.show()


>>> # Nice image! Let's save it.
>>> im.to_file('landscape.jpg')

The number of images returned by each search may change. Usually the number is smaller than 900. If you want all the images, set n_images=1000.

To contribute or report bugs, check the GitHub repo: https://github.com/rafpyprog/imagesoup
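To save the whole list rather than a single image, a small loop should be enough. This is only a sketch: `save_all` is a helper name I made up, and it assumes every search result has the same `to_file` method used above and that writing everything with a `.jpg` name is acceptable.

```python
import os


def save_all(images, directory, prefix='image'):
    """Call to_file() on every result, using numbered .jpg file names.

    Returns the paths that were written successfully.
    """
    os.makedirs(directory, exist_ok=True)
    paths = []
    for number, image in enumerate(images, start=1):
        path = os.path.join(directory, '%s_%04d.jpg' % (prefix, number))
        try:
            image.to_file(path)
        except Exception as error:  # assumed policy: report, skip, keep going
            print('could not save result %d: %s' % (number, error))
            continue
        paths.append(path)
    return paths
```

For the search above this would be called as `save_all(images, 'myDirectory', prefix='landscape')`.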

