Created on 19 Jan 2021 ;    Modified on 23 Jan 2021

How to create a minimal Flask project (part 4: creating a sitemap.xml)


This is the fourth part of a series of articles about Flask.

Objective

This is the fourth part of a Flask project to show a single page in two different versions:

  • single language;
  • two languages.

Here we discuss how to create a sitemap.xml for our website.

The previous article was about the importance of setting a different URL for each page. It's a matter of recognizability by search engines, e.g. Google.

But how do we submit our URLs to Google? The simplest method is to send Google, or another search engine, a sitemap.xml of our site.

A sitemap.xml is a set of URLs, written in the XML text format.
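
To make the format concrete, here is a minimal sketch that builds a two-entry sitemap with Python's standard library only. The URLs are placeholders, not the ones of this project, and build_sitemap is a name made up for the illustration; the namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap.xml document (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")     # one <url> element per page
        ET.SubElement(entry, "loc").text = url   # <loc> holds the page address
    return ET.tostring(urlset, encoding="unicode")


# Placeholder URLs, used only for illustration
example = build_sitemap([
    "http://example.com/",
    "http://example.com/about.html",
])
print(example)
```

Each page gets one url element, and its address goes in the loc child. The other fields we'll meet later (date, frequency, priority) are optional children of the same url element.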

Methodology

Writing a sitemap.xml by hand would be very tedious. So we are going to use a little library to help us: flask-sitemap.

In this case we'll work on both blueprints: single_page/oneel and single_page/twoels.

Development

Let's start by installing flask-sitemap from the console:

    # installation
    >cd flask_single_page
    >venv\Scripts\activate                   # activate python's virtual environment
    (venv) >pip install flask-sitemap        # install flask-sitemap
    ...
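
If we want to double check the installation from Python, a small sketch like the following can help. It only asks the import system whether the module can be found; is_installed is a helper name invented here, and we assume the package is importable as flask_sitemap (the name used in the import statements of this article).

```python
import importlib.util


def is_installed(module_name):
    """Return True when the named top-level module can be imported."""
    return importlib.util.find_spec(module_name) is not None


# Run this inside the activated virtual environment
print('flask_sitemap installed:', is_installed('flask_sitemap'))
```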

Then we modify single_page/__init__.py to initialize the library:

    from flask import Flask
    from flask_babel import Babel
    from flask_sitemap import Sitemap                            # +

    babel = Babel()
    sitemap = Sitemap()                                          # +

    def create_app():
        '''create and configure the app'''
        app = Flask(__name__)

        app.config.from_mapping(
            SECRET_KEY='leave-hope-to-enter',
            LANGUAGES={'en': 'english', 'it': 'italiano'},
        )

        from .oneel import views as views1
        app.register_blueprint(views1.oneel)

        from .twoels import views as views2
        app.register_blueprint(views2.twoels)

        babel.init_app(app)
        sitemap.init_app(app)                                    # +

        return app

Now we need a function to generate the list of URLs. We begin by changing the file single_page/oneel/views.py:

    from datetime import datetime                                 # +
    from flask import Blueprint, render_template
    from single_page import sitemap                               # +

    # this app will respond to srv/1l/... URLs
    oneel = Blueprint('oneel',
                      __name__,
                      static_folder='static',
                      template_folder='templates',
                      url_prefix='/1l')

    @oneel.route('/')                 # index URLs
    @oneel.route('/index')
    @oneel.route('/index.html')
    def index():
        return render_template('index.html', title='single language title')


    # same name as the view above: each decorator registers its own function
    # when it is applied, so the second definition does not replace the view
    @sitemap.register_generator                                   # +
    def index():
        '''generate URLs for the sitemap

           Note. used by flask-sitemap
        '''
        yield 'oneel.index', {}, datetime.now(), 'monthly', 0.7

Here the focal point is the decorator @sitemap.register_generator and the index() function it decorates. The latter is a generator of all the URLs of oneel; only one in this case: 1l/index.html. We'll see a more interesting example in twoels.
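
To see why a decorated generator is enough, it may help to look at a stripped-down stand-in for the registration mechanism. The MiniSitemap class below is hypothetical: it is not flask-sitemap's code, only a sketch of the idea that the decorator stores the function, and that the extension later iterates over whatever the function yields.

```python
from datetime import datetime


class MiniSitemap:
    """Hypothetical, simplified stand-in for the extension (not its real code)."""

    def __init__(self):
        self.url_generators = []

    def register_generator(self, func):
        # The decorator only records the generator function and returns it unchanged
        self.url_generators.append(func)
        return func

    def entries(self):
        # Later, every registered generator is called and its items are collected
        for generator in self.url_generators:
            yield from generator()


mini = MiniSitemap()


@mini.register_generator
def index():
    # Same tuple shape as in oneel/views.py
    yield 'oneel.index', {}, datetime.now(), 'monthly', 0.7


collected = list(mini.entries())
print(collected[0][0])   # endpoint name of the only entry
```

Nothing is generated at import time: the generator body runs only when the sitemap is requested, which is why it can rely on an application context being available at that moment.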

Similarly, we modify single_page/twoels/views.py as follows:

    from datetime import datetime                                 # +
    from flask import Blueprint, current_app, render_template, request, g
    from flask_babel import _
    from single_page import babel
    from single_page import sitemap                               # +

    # this app will respond to srv/<lang_code>/2l/... URLs
    twoels = Blueprint('twoels',
                       __name__,
                       static_folder='static',
                       template_folder='templates',
                       url_prefix='/<lang_code>/2l')

    @twoels.url_defaults
    def add_language_code(endpoint, values):
        values.setdefault('lang_code', g.lang_code)

    @twoels.url_value_preprocessor
    def pull_lang_code(endpoint, values):
        lc = values.get('lang_code', None)
        if lc in current_app.config['LANGUAGES']:
            g.lang_code = values.pop('lang_code')                # we'll also use this to set request.accept_languages
        else:
            raise ValueError(f"language {lc} is not accepted, because not in {tuple(current_app.config['LANGUAGES'].keys())}")

    @twoels.route('/')                 # index URLs
    @twoels.route('/index')
    @twoels.route('/index.html')
    def index():
        # default language code is in babel.default_locale
        return render_template('2lindex.html', title=_('two languages title'))

    @babel.localeselector
    def get_locale():
        # to test the Italian language: configure the web browser OR ...
        #   ... uncomment the following line of code
        #return 'it'
        if g.get('lang_code', None):
            return request.accept_languages.best_match([g.lang_code,])
        return request.accept_languages.best_match(current_app.config['LANGUAGES'].keys())

    @sitemap.register_generator                                   # +
    def index():
        '''generate URLs using language codes

           Note. used by flask-sitemap
        '''
        for lc in current_app.config['LANGUAGES'].keys():
            g.lang_code = lc                               # used by add_language_code
            yield 'twoels.index', {}, datetime.now(), 'monthly', 0.7

In this module the index() function decorated by @sitemap.register_generator is more complex. Here, to create the URLs, we need to set the language code. So we cycle through all the language codes, assigning each one to g.lang_code before handing twoels.index over to flask-sitemap. The add_language_code function then uses g.lang_code to contribute to the construction of the twoels URLs.
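
The interplay between the loop and add_language_code can be tried out without Flask. What follows is only a sketch under stated assumptions: g is replaced by a plain SimpleNamespace, and build_url is a hypothetical substitute for url_for that knows a single URL pattern.

```python
from types import SimpleNamespace

LANGUAGES = {'en': 'english', 'it': 'italiano'}   # same mapping as the app config
g = SimpleNamespace()                             # stand-in for flask.g


def add_language_code(values):
    # Mirrors the url_defaults hook: fill lang_code from g when it is missing
    values.setdefault('lang_code', g.lang_code)


def build_url(values):
    # Hypothetical replacement for url_for('twoels.index', **values)
    add_language_code(values)
    return '/{lang_code}/2l/'.format(**values)


def index():
    for lc in LANGUAGES:
        g.lang_code = lc          # set before the consumer builds the URL
        yield 'twoels.index', {}


# Each item is turned into a URL before the generator resumes,
# so g.lang_code still holds the language of the current item
urls = [build_url(values) for endpoint, values in index()]
print(urls)
```

The trick works because a generator is lazy: the loop advances to the next language only after the consumer has finished with the current item.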

We give flask-sitemap a tuple of five elements. From left to right:

  • the endpoint to use;
  • a (here empty) dictionary of values used by the endpoint;
  • the date of last modification of the page (here set to now());
  • the frequency of modification of the page;
  • the priority of the URL.
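
As a reminder of what goes where, here is a small sketch that unpacks such a tuple and names its five positions. The describe helper is hypothetical and not part of flask-sitemap. We assume that the date ends up in the lastmod element of the sitemap; the accepted frequencies and the 0.0 to 1.0 priority range are those of the sitemaps.org protocol.

```python
from datetime import datetime

# Allowed values for <changefreq> in the sitemaps.org protocol
CHANGEFREQ = {'always', 'hourly', 'daily', 'weekly', 'monthly', 'yearly', 'never'}


def describe(item):
    """Name the five positions of a yielded tuple (hypothetical helper)."""
    endpoint, values, lastmod, changefreq, priority = item
    if changefreq not in CHANGEFREQ:
        raise ValueError(f'unknown change frequency: {changefreq}')
    if not 0.0 <= priority <= 1.0:
        raise ValueError(f'priority out of range: {priority}')
    return {
        'endpoint': endpoint,
        'values': values,
        'lastmod': lastmod.date().isoformat(),   # assumed target: <lastmod>
        'changefreq': changefreq,
        'priority': priority,
    }


entry = describe(('twoels.index', {}, datetime(2021, 1, 19), 'monthly', 0.7))
print(entry)
```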

Now, if we run python run.py and request the URL http://localhost:5000/sitemap.xml, we get:

[Image: single page, sitemap]

That is not a thrilling sight. If we ask the browser to show the page source, we obtain:

[Image: single page, sitemap content]

surely a lot better.
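
Instead of reading the markup by eye, we could also list the URLs with a few lines of Python. The sketch below parses a sample document: the SAMPLE text is illustrative only, since the real response may contain different URLs and more fields, and list_locations is a helper name made up here.

```python
import xml.etree.ElementTree as ET

# Prefix mapping for the sitemaps.org namespace used by the document
NS = {'sm': 'http://www.sitemaps.org/schemas/sitemap/0.9'}

# Illustrative sample only: the real response may differ in URLs and fields
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://localhost:5000/1l/</loc><changefreq>monthly</changefreq></url>
  <url><loc>http://localhost:5000/en/2l/</loc><changefreq>monthly</changefreq></url>
  <url><loc>http://localhost:5000/it/2l/</loc><changefreq>monthly</changefreq></url>
</urlset>"""


def list_locations(xml_text):
    """Return the text of every <loc> element found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text for loc in root.findall('sm:url/sm:loc', NS)]


for location in list_locations(SAMPLE):
    print(location)
```

To check the running application, the sample can be replaced by the body of the real response, for example the bytes read with urllib.request.urlopen from http://localhost:5000/sitemap.xml.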

Just for the sake of future readability, we can explicitly write the endpoint that handles the sitemap URL http://localhost:5000/sitemap.xml, by modifying single_page/__init__.py:

    from flask import Flask
    from flask_babel import Babel
    from flask_sitemap import Sitemap

    babel = Babel()
    sitemap = Sitemap()

    def create_app():
        '''create and configure the app'''
        app = Flask(__name__)

        app.config.from_mapping(
            SECRET_KEY='leave-hope-to-enter',
            LANGUAGES={'en': 'english', 'it': 'italiano'},
        )

        from .oneel import views as views1
        app.register_blueprint(views1.oneel)

        from .twoels import views as views2
        app.register_blueprint(views2.twoels)

        babel.init_app(app)
        sitemap.init_app(app)

        @app.route('/sitemap')                                    # +
        def ep_sitemap():                            # endpoint for sitemap, served at /sitemap
            return sitemap.sitemap(), 200, {'Content-Type': 'text/xml'}

        return app

At the end of the factory function create_app we add the endpoint ep_sitemap. It returns an XML response, built by sitemap.sitemap(), and declares it as a document of type text/xml.

Enjoy, ldfa