storing ext3 in a sql database





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ height:90px;width:728px;box-sizing:border-box;
}







0















Would it be possible to store an ext3-like system in mysql? For example, to build a file metadata system that would include paths and permissions? If so, what would be an example format of it? I suppose the starting denormalized route would be here:



-rw-r--r--@  1 david  staff   552B Nov  2 15:55 wsgi.py
-rw-r--r-- 1 david staff 113B Oct 30 20:45 ~.bash_profile


So we would have 7 columns:



permissions | links | user | group | size | date | filename


I suppose we would need an additional two for 'management':



fileId | directoryId


Does this seem way off base, or would that be possible? What would be a better and more normalized way to store it?










share|improve this question























  • Are you asking to store/manage the result of a directory listing, or are you asking if one could replace ext3 by a sql database?

    – Solarflare
    Nov 17 '18 at 3:04











  • @Solarflare I mean just the directory metadata -- where the files are located and what permissions they have.

    – David L
    Nov 17 '18 at 4:56











  • Sorry, I am still not really sure what you are looking for. Do you want us to add all the information to your erd that we think should be part of the metadata of a filesystem, or are you asking if the data you currently chose is normalized? I'd struggle e.g. with "links", as this should/could be calculated if you store/retrieve those links with your directory listing (which is not clear if you want to do that); also you probably need some hierarchy model for your directory, see e.g. here.

    – Solarflare
    Nov 17 '18 at 11:26


















0















Would it be possible to store an ext3-like system in mysql? For example, to build a file metadata system that would include paths and permissions? If so, what would be an example format of it? I suppose the starting denormalized route would be here:



-rw-r--r--@  1 david  staff   552B Nov  2 15:55 wsgi.py
-rw-r--r-- 1 david staff 113B Oct 30 20:45 ~.bash_profile


So we would have 7 columns:



permissions | links | user | group | size | date | filename


I suppose we would need an additional two for 'management':



fileId | directoryId


Does this seem way off base, or would that be possible? What would be a better and more normalized way to store it?










share|improve this question























  • Are you asking to store/manage the result of a directory listing, or are you asking if one could replace ext3 by a sql database?

    – Solarflare
    Nov 17 '18 at 3:04











  • @Solarflare I mean just the directory metadata -- where the files are located and what permissions they have.

    – David L
    Nov 17 '18 at 4:56











  • Sorry, I am still not really sure what you are looking for. Do you want us to add all the information to your erd that we think should be part of the metadata of a filesystem, or are you asking if the data you currently chose is normalized? I'd struggle e.g. with "links", as this should/could be calculated if you store/retrieve those links with your directory listing (which is not clear if you want to do that); also you probably need some hierarchy model for your directory, see e.g. here.

    – Solarflare
    Nov 17 '18 at 11:26














0












0








0








Would it be possible to store an ext3-like system in mysql? For example, to build a file metadata system that would include paths and permissions? If so, what would be an example format of it? I suppose the starting denormalized route would be here:



-rw-r--r--@  1 david  staff   552B Nov  2 15:55 wsgi.py
-rw-r--r-- 1 david staff 113B Oct 30 20:45 ~.bash_profile


So we would have 7 columns:



permissions | links | user | group | size | date | filename


I suppose we would need an additional two for 'management':



fileId | directoryId


Does this seem way off base, or would that be possible? What would be a better and more normalized way to store it?










share|improve this question














Would it be possible to store an ext3-like system in mysql? For example, to build a file metadata system that would include paths and permissions? If so, what would be an example format of it? I suppose the starting denormalized route would be here:



-rw-r--r--@  1 david  staff   552B Nov  2 15:55 wsgi.py
-rw-r--r-- 1 david staff 113B Oct 30 20:45 ~.bash_profile


So we would have 7 columns:



permissions | links | user | group | size | date | filename


I suppose we would need an additional two for 'management':



fileId | directoryId


Does this seem way off base, or would that be possible? What would be a better and more normalized way to store it?







mysql sql database-design ext3






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Nov 17 '18 at 1:14









David LDavid L

150117




150117













  • Are you asking to store/manage the result of a directory listing, or are you asking if one could replace ext3 by a sql database?

    – Solarflare
    Nov 17 '18 at 3:04











  • @Solarflare I mean just the directory metadata -- where the files are located and what permissions they have.

    – David L
    Nov 17 '18 at 4:56











  • Sorry, I am still not really sure what you are looking for. Do you want us to add all the information to your erd that we think should be part of the metadata of a filesystem, or are you asking if the data you currently chose is normalized? I'd struggle e.g. with "links", as this should/could be calculated if you store/retrieve those links with your directory listing (which is not clear if you want to do that); also you probably need some hierarchy model for your directory, see e.g. here.

    – Solarflare
    Nov 17 '18 at 11:26



















  • Are you asking to store/manage the result of a directory listing, or are you asking if one could replace ext3 by a sql database?

    – Solarflare
    Nov 17 '18 at 3:04











  • @Solarflare I mean just the directory metadata -- where the files are located and what permissions they have.

    – David L
    Nov 17 '18 at 4:56











  • Sorry, I am still not really sure what you are looking for. Do you want us to add all the information to your erd that we think should be part of the metadata of a filesystem, or are you asking if the data you currently chose is normalized? I'd struggle e.g. with "links", as this should/could be calculated if you store/retrieve those links with your directory listing (which is not clear if you want to do that); also you probably need some hierarchy model for your directory, see e.g. here.

    – Solarflare
    Nov 17 '18 at 11:26

















Are you asking to store/manage the result of a directory listing, or are you asking if one could replace ext3 by a sql database?

– Solarflare
Nov 17 '18 at 3:04





Are you asking to store/manage the result of a directory listing, or are you asking if one could replace ext3 by a sql database?

– Solarflare
Nov 17 '18 at 3:04













@Solarflare I mean just the directory metadata -- where the files are located and what permissions they have.

– David L
Nov 17 '18 at 4:56





@Solarflare I mean just the directory metadata -- where the files are located and what permissions they have.

– David L
Nov 17 '18 at 4:56













Sorry, I am still not really sure what you are looking for. Do you want us to add all the information to your erd that we think should be part of the metadata of a filesystem, or are you asking if the data you currently chose is normalized? I'd struggle e.g. with "links", as this should/could be calculated if you store/retrieve those links with your directory listing (which is not clear if you want to do that); also you probably need some hierarchy model for your directory, see e.g. here.

– Solarflare
Nov 17 '18 at 11:26





Sorry, I am still not really sure what you are looking for. Do you want us to add all the information to your erd that we think should be part of the metadata of a filesystem, or are you asking if the data you currently chose is normalized? I'd struggle e.g. with "links", as this should/could be calculated if you store/retrieve those links with your directory listing (which is not clear if you want to do that); also you probably need some hierarchy model for your directory, see e.g. here.

– Solarflare
Nov 17 '18 at 11:26












1 Answer
1






active

oldest

votes


















0














Sure, you could crawl the filesystem and collect meta info about each file and directory, and then store it into a database. But that info cannot be a substitute for the filesystem's info, only a 'copy' of it.



I would have (at least) 2 tables: Files and Directories. A file would have a dir_id. A directory would have both a dir_id for itself and a parent_id for walking up the directory tree. The top of the tree (the "root") would be a zero or null.



Soft links, hard links, devices, mounts, /proc, etc, would add challenges, but maybe you don't care about them?



size needs to be a BIGINT. permissions, if encoded, might fit in SMALLINT UNSIGNED, or could be stored as a string. user and group could be either the id or the string; you may need to have an id:name table for users and one for groups. For date, consider TIMESTAMP or maybe DATETIME; keep in mind that the OS may be doing something closer to TIMESTAMP to deal with timezones. (Windows is possibly different.)



If you are storing copies of the files, then I recommend using another table, linked to Files via file_id. But, beware, a LONGBLOB is limited to 4GB, and there are other settings that make it difficult to store anything bigger than 16MB. So, I might suggest chunking anything bigger than 64KB, compressing the chunks, etc. (This probably requires yet another table.)



As for reconstructing the full path from the "hierarchy" that I suggested above, it is only a small amount of code. It could be done in your app code, or in a Stored Procedure. With MySQL 8.0 or MariaDB 10.2, "CTEs" are available to facilitate drilling down a tree.



(Yes, I have done most of these things in a couple of projects in the past.)






share|improve this answer
























  • thanks for this response, it's very helpful. Out of curiosity, why would you need to separate Files from Directories into two tables? For example, couldn't you store it in one table like Entity, or something similar?

    – David L
    Nov 18 '18 at 1:30











  • @DavidL - You have a point. It depends on what info you need. In my projects, I needed enough differences to warrant two tables. With a single table, you probably need a flag saying directory vs file.

    – Rick James
    Nov 18 '18 at 1:44












Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53347311%2fstoring-ext3-in-a-sql-database%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









0














Sure, you could crawl the filesystem and collect meta info about each file and directory, and then store it into a database. But that info cannot be a substitute for the filesystem's info, only a 'copy' of it.



I would have (at least) 2 tables: Files and Directories. A file would have a dir_id. A directory would have both a dir_id for itself and a parent_id for walking up the directory tree. The top of the tree (the "root") would be a zero or null.



Soft links, hard links, devices, mounts, /proc, etc, would add challenges, but maybe you don't care about them?



size needs to be a BIGINT. permissions, if encoded, might fit in SMALLINT UNSIGNED, or could be stored as a string. user and group could be either the id or the string; you may need to have an id:name table for users and one for groups. For date, consider TIMESTAMP or maybe DATETIME; keep in mind that the OS may be doing something closer to TIMESTAMP to deal with timezones. (Windows is possibly different.)



If you are storing copies of the files, then I recommend using another table, linked to Files via file_id. But, beware, a LONGBLOB is limited to 4GB, and there are other settings that make it difficult to store anything bigger than 16MB. So, I might suggest chunking anything bigger than 64KB, compressing the chunks, etc. (This probably requires yet another table.)



As for reconstructing the full path from the "hierarchy" that I suggested above, it is only a small amount of code. It could be done in your app code, or in a Stored Procedure. With MySQL 8.0 or MariaDB 10.2, "CTEs" are available to facilitate drilling down a tree.



(Yes, I have done most of these things in a couple of projects in the past.)






share|improve this answer
























  • thanks for this response, it's very helpful. Out of curiosity, why would you need to separate Files from Directories into two tables? For example, couldn't you store it in one table like Entity, or something similar?

    – David L
    Nov 18 '18 at 1:30











  • @DavidL - You have a point. It depends on what info you need. In my projects, I needed enough differences to warrant two tables. With a single table, you probably need a flag saying directory vs file.

    – Rick James
    Nov 18 '18 at 1:44
















0














Sure, you could crawl the filesystem and collect meta info about each file and directory, and then store it into a database. But that info cannot be a substitute for the filesystem's info, only a 'copy' of it.



I would have (at least) 2 tables: Files and Directories. A file would have a dir_id. A directory would have both a dir_id for itself and a parent_id for walking up the directory tree. The top of the tree (the "root") would be a zero or null.



Soft links, hard links, devices, mounts, /proc, etc, would add challenges, but maybe you don't care about them?



size needs to be a BIGINT. permissions, if encoded, might fit in SMALLINT UNSIGNED, or could be stored as a string. user and group could be either the id or the string; you may need to have an id:name table for users and one for groups. For date, consider TIMESTAMP or maybe DATETIME; keep in mind that the OS may be doing something closer to TIMESTAMP to deal with timezones. (Windows is possibly different.)



If you are storing copies of the files, then I recommend using another table, linked to Files via file_id. But, beware, a LONGBLOB is limited to 4GB, and there are other settings that make it difficult to store anything bigger than 16MB. So, I might suggest chunking anything bigger than 64KB, compressing the chunks, etc. (This probably requires yet another table.)



As for reconstructing the full path from the "hierarchy" that I suggested above, it is only a small amount of code. It could be done in your app code, or in a Stored Procedure. With MySQL 8.0 or MariaDB 10.2, "CTEs" are available to facilitate drilling down a tree.



(Yes, I have done most of these things in a couple of projects in the past.)






share|improve this answer
























  • thanks for this response, it's very helpful. Out of curiosity, why would you need to separate Files from Directories into two tables? For example, couldn't you store it in one table like Entity, or something similar?

    – David L
    Nov 18 '18 at 1:30











  • @DavidL - You have a point. It depends on what info you need. In my projects, I needed enough differences to warrant two tables. With a single table, you probably need a flag saying directory vs file.

    – Rick James
    Nov 18 '18 at 1:44














0












0








0







Sure, you could crawl the filesystem and collect meta info about each file and directory, and then store it into a database. But that info cannot be a substitute for the filesystem's info, only a 'copy' of it.



I would have (at least) 2 tables: Files and Directories. A file would have a dir_id. A directory would have both a dir_id for itself and a parent_id for walking up the directory tree. The top of the tree (the "root") would be a zero or null.



Soft links, hard links, devices, mounts, /proc, etc, would add challenges, but maybe you don't care about them?



size needs to be a BIGINT. permissions, if encoded, might fit in SMALLINT UNSIGNED, or could be stored as a string. user and group could be either the id or the string; you may need to have an id:name table for users and one for groups. For date, consider TIMESTAMP or maybe DATETIME; keep in mind that the OS may be doing something closer to TIMESTAMP to deal with timezones. (Windows is possibly different.)



If you are storing copies of the files, then I recommend using another table, linked to Files via file_id. But, beware, a LONGBLOB is limited to 4GB, and there are other settings that make it difficult to store anything bigger than 16MB. So, I might suggest chunking anything bigger than 64KB, compressing the chunks, etc. (This probably requires yet another table.)



As for reconstructing the full path from the "hierarchy" that I suggested above, it is only a small amount of code. It could be done in your app code, or in a Stored Procedure. With MySQL 8.0 or MariaDB 10.2, "CTEs" are available to facilitate drilling down a tree.



(Yes, I have done most of these things in a couple of projects in the past.)






share|improve this answer













Sure, you could crawl the filesystem and collect meta info about each file and directory, and then store it into a database. But that info cannot be a substitute for the filesystem's info, only a 'copy' of it.



I would have (at least) 2 tables: Files and Directories. A file would have a dir_id. A directory would have both a dir_id for itself and a parent_id for walking up the directory tree. The top of the tree (the "root") would be a zero or null.



Soft links, hard links, devices, mounts, /proc, etc, would add challenges, but maybe you don't care about them?



size needs to be a BIGINT. permissions, if encoded, might fit in SMALLINT UNSIGNED, or could be stored as a string. user and group could be either the id or the string; you may need to have an id:name table for users and one for groups. For date, consider TIMESTAMP or maybe DATETIME; keep in mind that the OS may be doing something closer to TIMESTAMP to deal with timezones. (Windows is possibly different.)



If you are storing copies of the files, then I recommend using another table, linked to Files via file_id. But, beware, a LONGBLOB is limited to 4GB, and there are other settings that make it difficult to store anything bigger than 16MB. So, I might suggest chunking anything bigger than 64KB, compressing the chunks, etc. (This probably requires yet another table.)



As for reconstructing the full path from the "hierarchy" that I suggested above, it is only a small amount of code. It could be done in your app code, or in a Stored Procedure. With MySQL 8.0 or MariaDB 10.2, "CTEs" are available to facilitate drilling down a tree.



(Yes, I have done most of these things in a couple of projects in the past.)







share|improve this answer












share|improve this answer



share|improve this answer










answered Nov 17 '18 at 20:09









Rick JamesRick James

71.3k567106




71.3k567106













  • thanks for this response, it's very helpful. Out of curiosity, why would you need to separate Files from Directories into two tables? For example, couldn't you store it in one table like Entity, or something similar?

    – David L
    Nov 18 '18 at 1:30











  • @DavidL - You have a point. It depends on what info you need. In my projects, I needed enough differences to warrant two tables. With a single table, you probably need a flag saying directory vs file.

    – Rick James
    Nov 18 '18 at 1:44



















  • thanks for this response, it's very helpful. Out of curiosity, why would you need to separate Files from Directories into two tables? For example, couldn't you store it in one table like Entity, or something similar?

    – David L
    Nov 18 '18 at 1:30











  • @DavidL - You have a point. It depends on what info you need. In my projects, I needed enough differences to warrant two tables. With a single table, you probably need a flag saying directory vs file.

    – Rick James
    Nov 18 '18 at 1:44

















thanks for this response, it's very helpful. Out of curiosity, why would you need to separate Files from Directories into two tables? For example, couldn't you store it in one table like Entity, or something similar?

– David L
Nov 18 '18 at 1:30





thanks for this response, it's very helpful. Out of curiosity, why would you need to separate Files from Directories into two tables? For example, couldn't you store it in one table like Entity, or something similar?

– David L
Nov 18 '18 at 1:30













@DavidL - You have a point. It depends on what info you need. In my projects, I needed enough differences to warrant two tables. With a single table, you probably need a flag saying directory vs file.

– Rick James
Nov 18 '18 at 1:44





@DavidL - You have a point. It depends on what info you need. In my projects, I needed enough differences to warrant two tables. With a single table, you probably need a flag saying directory vs file.

– Rick James
Nov 18 '18 at 1:44




















draft saved

draft discarded




















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53347311%2fstoring-ext3-in-a-sql-database%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Xamarin.iOS Cant Deploy on Iphone

Glorious Revolution

Dulmage-Mendelsohn matrix decomposition in Python