Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding file size to views #3539

Merged
merged 28 commits into from
Apr 19, 2018
Merged

Adding file size to views #3539

merged 28 commits into from
Apr 19, 2018

Conversation

ashleytqy
Copy link
Contributor

@ashleytqy ashleytqy commented Apr 17, 2018

This PR addresses: #3521

  • Adds file size to files + notebooks, but not for directories (it's undefined for directories)
  • Displays it in the view, with sorting functionality
  • Filesize is humanized (originally in bytes)

@ckilcrease

Sort by ascending

screen shot 2018-04-18 at 1 09 37 pm

@Carreau
Copy link
Member

Carreau commented Apr 18, 2018

Thanks,

I haven't touched this part of the code base in a while so i'll defer to someone else for complete review.

You added looking at filesize in each model when doing a Get request. Was that necessary to get that in directory listing ? Or is the server doing a os.stat on every file when listing a directory ? If is is there might be a perf bottle neck we need to be careful there.

You seem to have 16 commits and a collaborator (good job to both of you for handling Git). Do you wish to reduce your commit number (rebase), to make history cleaner.

From your screenshot I can see that some files are 2B and below, and actually fraction of a Bytes.... Are you sure this is right ? Is there not a factor of 1000 somewhere ? (feel free to use an external library, assuming it is small enough, the license is right... etc). )

I'll do further inline comments.

@@ -814,6 +814,10 @@ definitions:
type: string
description: Last modified timestamp
format: dateTime
size:
type: integer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Integer or None ? If size can't be found, probably say that in the description at least.

$("<span/>")
.addClass("file_size")
.addClass("pull-right")
.css("width", "65px")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You likely want to avoid inline css.

Copy link
Contributor Author

@ashleytqy ashleytqy Apr 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you know where the css should go? custom.css seems like a user-specific file, and not quite right. from what i've seen, most of the CSS is defined like this:

var add_uploading_button = function (f, item) {
// change buttons, add a progress bar
var uploading_button = item.find('.upload_button').text("Uploading");
uploading_button.off('click'); // Prevent double upload
var progress_bar = $('<span/>')
.addClass('progress-bar')
.css('top', '0')
.css('left', '0')
.css('width', '0')
.css('height', '3px')
.css('border-radius', '0 0 0 0')
.css('display', 'inline-block')
.css('position', 'absolute');

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likely in notebook/static/tree/less/tree.less there are case when we do inline css, there might be a reason for progress-bar, maybe not, if not that's likely technical dept we need to fix at some point.

You'll likely need to recompile thee css with python setup.py css after modification of the less file.

@@ -835,6 +866,7 @@ define([
// Add in the date that the file was last modified
item.find(".item_modified").text(utils.format_datetime(model.last_modified));
item.find(".item_modified").attr("title", moment(model.last_modified).format("YYYY-MM-DD HH:mm"));
item.find(".file_size").html(utils.format_filesize(model.size) || "&nbsp;");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the use of .html() should be a red flag. From a security standpoint it would not pass an audit. We had security issues before because of HTML usage.

If somewhere where to replace you API to return cannot find <size> for file <file> the you get scrip injection in your page, and your client is compromised.

If you've seen use of .html() in other place of the javascript where .text() should/could be used, please open an issue.

Copy link
Contributor Author

@ashleytqy ashleytqy Apr 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: using .html() – we're using it so that &nbsp; renders because .text() doesn't display an empty string. do you have any suggestions to how to display a &nbsp; without using .html?

it's also used here, which i'm assuming the author also wanted to use the &nbsp; character

$('#counter-select-all').html(checked===0 ? '&nbsp;' : checked);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question, there are several way to approach that. A simple one would be to use .html() just for &nbsp;:

if (cfoo){
	$(...).text(foo); 
} else {
    $(...).html("&nbsp");
}

That reduce the use of html so it's one less risk.

Thinking about it more, &nbsp; is also just an escape sequence for unicode character Non Breaking Space, which you can enter with your keyboard using some clever keyboard combinaison | | (nbsp), not like | |(space), check the source, on mac it's Alt Space. That's 1) annoying to type 2) annoying to see when reading code. That's not optimal. After thinking a few minutes more you can realize that hopefully most programming language allow you to have escape sequence, knowing that Non-Breaking space is unicode U+00A0 you can go with :

.text(checked===0 ? '\xA0' : checked); 

If you want to even avoid using jQuery you can use the innerText attribute and assign to it.

With your gained knowledge you can even open an issue for $('#counter-select-all').html(checked===0 ? '&nbsp;' : checked); to be fixed.

As an exercise – if you like the fun of hacking – get in a shoes of an attacker and try to modify the server side to not return the size as an integer but a string, and attempt to have a script injection that display a message in the frontend.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stepping back a bit from unicode discussions: do we actually need a non-breaking space at all? I think the layout should still work if the element has no text.

Copy link
Contributor Author

@ashleytqy ashleytqy Apr 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed it with using \xA0 (thanks @Carreau !) . Without text, the span element wrapping it doesn't seem to adhere to the width it's given.

@ashleytqy
Copy link
Contributor Author

ashleytqy commented Apr 18, 2018

@Carreau I tried to install and use pretty-bytes with bower but the library doesn't seem compatible with bower (it threw a module is not recognized error, so i think the library is better suited for an npm install which notebook doesn't use), so i copied over the code with credits if thats okay with you

size:
type: integer
description: "The size of the file or directory."
format: bytes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should just go in the description. Swagger specifies a fixed list of formats for things like dateTime. From the spec's point of view, this is just a number - I don't think it has a way of declaring a unit.

@@ -183,6 +183,14 @@ def is_hidden(self, path):
os_path = self._get_os_path(path=path)
return is_hidden(os_path, self.root_dir)

def _get_file_size(self, path):
try:
size = os.path.getsize(path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will do a new stat on the file, which is best avoided for performance reasons. We already get a stat object in _base_model() - perhaps we can use the size from that, and remove it again for directories. It's a bit of an ugly trick, but for a big directory, it could make a difference.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@takluyver thank you -- I've updated so that it uses file size from st_size (retrieved from stat) in _base_model(), but let me know if this isn't what you meant

@@ -1038,7 +1038,34 @@ define([
return (order == 1) ? 1 : -1;
}
};

// source: https://github.com/sindresorhus/pretty-bytes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm OK with this since it's one fairly small function, and it's unlikely to change. But we should include a copy of the license in the comment.

https://github.com/sindresorhus/pretty-bytes/blob/master/license

@takluyver
Copy link
Member

Looking at the screenshots, I think there should be a bit more whitespace between the last modified time and the size, so it's clear that they're separate fields. Maybe right-align the sizes?

@takluyver
Copy link
Member

Thanks, that was what I meant about using the existing stat info. :-)

I think this is looking good now, though I'm too tired to review it closely. But there's some problem with the JS tests on Travis - can you investigate?

@Carreau
Copy link
Member

Carreau commented Apr 18, 2018

Travis seem happy, Appveyor was not but this seem like an unrelated error. I restarted he build.

Copy link
Member

@Carreau Carreau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job !

try:
# size of file
size = info.st_size
except (ValueError, OSError):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there can be an error here - if lstat() returns, then we should have a valid stat object from which we can get st_size. But it shouldn't be a problem to be defensive in case I'm wrong.

@@ -814,6 +814,9 @@ definitions:
type: string
description: Last modified timestamp
format: dateTime
size:
type: integer
description: "The size of the file or notebook in bytes. If no size is provided, defaults to None."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is describing a JSON API, it's probably better to refer to null, which is the JSON/Javascript name, rather than None, which is from Python. This isn't worth holding the PR up for, so I'll change it myself and then merge.

@takluyver takluyver merged commit 5875235 into jupyter:master Apr 19, 2018
@takluyver
Copy link
Member

Thanks!

@takluyver takluyver added this to the 5.5 milestone Apr 24, 2018
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 1, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants